Abstract 

In many applications (e.g. recognition of geophysical and biomedical signals
and multiscale analysis of images), it is of interest to analyze and recognize
phenomena occurring at different scales. The recently introduced wavelet
transforms provide a time-and-scale decomposition of signals that offers the
possibility of such analysis. At present, however, there is no corresponding
statistical framework to support the development of optimal, multiscale
statistical signal processing algorithms. In this paper we describe such a
framework. The theory of multiscale signal representations leads naturally to
models of signals on trees, and this provides the framework for our
investigation. In particular, in this paper we describe the class of isotropic
processes on homogenous trees and develop a theory of autoregressive models in
this context. This leads to generalizations of Schur and Levinson recursions,
associated properties of the resulting reflection coefficients, and the initial
pieces in a system theory for multiscale modeling.
1  Introduction 


The investigation of multi-scale representations of signals and the development of
multiscale algorithms has been and remains a topic of much interest in many contexts.
In some cases, such as in the use of fractal models for signals and images [13,27], the
motivation has directly been the fact that the phenomenon of interest exhibits
patterns of importance at multiple scales. A second motivation has been the possibility
of developing highly parallel and iterative algorithms based on such representations.
Multigrid methods for solving partial differential equations [14,23,28,30] or for
performing Monte Carlo experiments [18] are a good example. A third motivation stems
from so-called “sensor fusion” problems in which one is interested in combining
measurements with very different spatial resolutions. Geophysical problems,
for example, often have this character. Finally, renormalization group ideas, from
statistical physics, now find application in methods for improving convergence in
large-scale simulated annealing algorithms for Markov random field estimation [20].

One  of  the  more  recent  areas  of  investigation  in  multi-scale  analysis  has  been 
the  development  of  a  theory  of  multi-scale  representations  of  signals  [24,26]  and  the 
closely  related  topic  of  wavelet  transforms  [4,5,6,7,10,19,22].  These  methods  have 
drawn  considerable  attention  in  several  disciplines  including  signal  processing  because 
they  appear  to  be  a  natural  way  to  perform  a  time-scale  decomposition  of  signals  and 
because  examples  that  have  been  given  of  such  transforms  seem  to  indicate  that  it 
should  be  possible  to  develop  efficient  optimal  processing  algorithms  based  on  these 
representations. The development of such optimal algorithms, e.g. for the
reconstruction of noise-degraded signals or for the detection and localization of
transient signals of different duration, requires, of course, the development of a corresponding
theory  of  stochastic  processes  and  their  estimation.  The  research  presented  in  this 
and  several  other  papers  and  reports  [17,18]  has  the  development  of  this  theory  as  its 
objective. 

In  the  next  section  we  introduce  multi-scale  representations  of  signals  and  wavelet 
transforms  and  from  these  we  motivate  the  investigation  of  stochastic  processes  on 
dyadic  trees.  In  that  section  we  also  introduce  the  class  of  isotropic  processes  on 
dyadic  trees  and  set  the  stage  for  introducing  dynamic  models  on  trees  by  describing 
their  structure  and  introducing  a  rudimentary  transform  theory.  In  Section  2  we 
also  introduce  the  class  of  autoregressive  (AR)  models  on  trees.  As  we  will  see,  the 
geometry  and  structure  of  a  dyadic  tree  is  such  that  the  dimension  of  an  AR  model 
increases with the order of the model. Thus an nth order AR model is characterized
by more than n coefficients whose interdependence is specified by a complex relation
and the passage from order n to order n + 1 is far from simple. In contrast, in
Section  3  we  obtain  a  far  simpler  picture  if  we  consider  the  generalization  of  lattice 
structures,  and  in  particular  we  find  that  only  one  reflection  coefficient  is  added 
as  the  order  is  increased  by  one.  The  latter  fact  leads  to  the  development  of  a 
set  of  scalar  recursions  that  provide  us  with  the  reflection  coefficients  and  can  be 
viewed  as  generalizations  of  the  Schur  and  Levinson  recursions  for  AR  models  of  time 
series.  These  recursions  are  also  developed  in  Section  3  as  are  the  constraints  that 
the  reflection  coefficients  must  satisfy  which  are  somewhat  different  than  for  the  case 
of  time  series.  In  Section  4  we  then  present  the  full  vector  Levinson  recursions  that 
provide  us  with  both  whitening  and  modeling  filters  for  AR  processes,  and  in  Section  5 
we  use  the  analysis  of  the  preceding  sections  to  provide  a  complete  characterization 
of  the  structure  of  autoregressive  processes  and  a  necessary  and  sufficient  condition 
for  an  isotropic  process  to  be  purely  nondeterministic.  The  paper  concludes  with  a 
brief  discussion  in  Section  6. 
2  Multiscale  Representations  and  Stochastic 
Processes  on  Homogenous  Trees 

2.1  Multiscale  Representations  and  Wavelet  Transforms 

The multi-scale representation [25,26] of a continuous signal $f(x)$ consists of a
sequence of approximations of that signal at finer and finer scales where the
approximation of $f(x)$ at the mth scale is given by

$$f_m(x) = \sum_{n=-\infty}^{+\infty} f(m,n)\,\phi(2^m x - n) \tag{2.1}$$

As $m \to \infty$ the approximation consists of a sum of many highly compressed, weighted,
and shifted versions of the function $\phi(x)$ whose choice is far from arbitrary. In
particular, in order for the (m + 1)st approximation to be a refinement of the mth, we
require that $\phi(x)$ be exactly representable at the next scale:

$$\phi(x) = \sum_{n} h(n)\,\phi(2x - n) \tag{2.2}$$

Furthermore, in order for (2.1) to be an orthogonal series, $\phi(x)$ and its integer translates
must form an orthogonal set. As shown in [7], $h(n)$ must satisfy several conditions for
this and several other properties of the representation to hold. In particular $h(n)$ must
be the impulse response of a quadrature mirror filter [7,31]. The simplest example of
such a $\phi$, $h$ pair is the Haar approximation with

$$\phi(x) = \begin{cases} 1 & 0 \le x < 1 \\ 0 & \text{otherwise} \end{cases} \tag{2.3}$$

and

$$h(n) = \begin{cases} 1 & n = 0, 1 \\ 0 & \text{otherwise} \end{cases} \tag{2.4}$$

Multiscale representations are closely related to wavelet transforms. Such a
transform is based on a single function $\psi(x)$ that has the property that the full set of its
scaled translates $\{2^{m/2}\psi(2^m x - n)\}$ forms a complete orthonormal basis for $L^2$. In [7]
it is shown that $\phi$ and $\psi$ are related via an equation of the form

$$\psi(x) = \sum_{n} g(n)\,\phi(2x - n) \tag{2.5}$$

where $g(n)$ and $h(n)$ form a conjugate mirror filter pair [3], and that

$$f_{m+1}(x) = f_m(x) + \sum_{n} d(m,n)\,\psi(2^m x - n) \tag{2.6}$$

$f_m(x)$ is simply the partial orthonormal expansion of $f(x)$, up to scale m, with respect
to the basis defined by $\psi$. For example if $\phi$ and $h$ are as in eq. (2.3), eq. (2.4), then

$$\psi(x) = \begin{cases} 1 & 0 \le x < 1/2 \\ -1 & 1/2 \le x < 1 \\ 0 & \text{otherwise} \end{cases} \tag{2.7}$$

$$g(n) = \begin{cases} 1 & n = 0 \\ -1 & n = 1 \\ 0 & \text{otherwise} \end{cases} \tag{2.8}$$

and $\{2^{m/2}\psi(2^m x - n)\}$ is the Haar basis.

From the preceding remarks we see that we have a dynamical relationship between
the coefficients $f(m,n)$ at one scale and those at the next. Indeed this relationship
defines a lattice on the points $(m,n)$, where $(m+1,k)$ is connected to $(m,n)$ if
$f(m,n)$ influences $f(m+1,k)$. In particular the Haar representation naturally defines
a dyadic tree structure on the points $(m,n)$ in which each point has two descendants
corresponding to the two subdivisions of the support interval of $\phi(2^m x - n)$, namely
those of $\phi(2^{m+1}x - 2n)$ and $\phi(2^{m+1}x - 2n - 1)$. This observation provides the
motivation for the development of models for stochastic processes on dyadic trees as
the basis for a statistical theory of multiresolution stochastic processes.
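The scale-to-scale relationship just described can be made concrete for the Haar pair. The following is a minimal sketch, assuming the unnormalized convention of (2.1) and (2.6), under which $f(m+1,2n) = f(m,n) + d(m,n)$ and $f(m+1,2n+1) = f(m,n) - d(m,n)$; the function names `haar_analysis` and `haar_synthesis` are ours and purely illustrative.

```python
def haar_analysis(fine):
    """Split coefficients f(m+1, .) into coarse f(m, .) and details d(m, .).

    Each parent node (m, n) takes the mean of its two descendants
    (m+1, 2n) and (m+1, 2n+1); the detail is half their difference.
    """
    if len(fine) % 2:
        raise ValueError("need an even number of fine-scale coefficients")
    half = len(fine) // 2
    coarse = [(fine[2 * n] + fine[2 * n + 1]) / 2 for n in range(half)]
    detail = [(fine[2 * n] - fine[2 * n + 1]) / 2 for n in range(half)]
    return coarse, detail


def haar_synthesis(coarse, detail):
    """Rebuild f(m+1, .) from f(m, .) and d(m, .), i.e. one step of (2.6)."""
    fine = []
    for c, d in zip(coarse, detail):
        fine.extend([c + d, c - d])  # left descendant, right descendant
    return fine
```

Each coarse coefficient depends on exactly two fine-scale coefficients, which is the dyadic tree structure referred to in the text.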

2.2  Homogenous  Trees 

Homogenous  trees,  and  their  structure,  have  been  the  subject  of  some  work  [1,2,3,12, 
16]  in  the  past  on  which  we  build  and  which  we  now  briefly  review.  A  homogenous 
tree  T  of  order  q  is  an  infinite  acyclic,  undirected,  connected  graph  such  that  every 
node  of  T  has  exactly  (q  +  1)  branches  to  other  nodes.  Note  that  q  =  1  corresponds 
to  the  usual  integers  with  the  obvious  branches  from  one  integer  to  its  two  neighbors. 
The  case  of  q  =  2,  illustrated  in  Figure  2.1,  corresponds,  as  we  will  see,  to  the  dyadic 
tree  on  which  we  focus  throughout  the  paper.  In  2-D  signal  processing  it  would  be 
natural  to  consider  the  case  of  q  =  4  leading  to  a  pyramidal  structure  for  our  indexing 
of  processes. 

The tree T has a natural notion of distance: $d(s,t)$ is the number of branches
along the shortest path between the nodes $s, t \in T$ (by abuse of notation we use
T to denote both the tree and its collection of nodes). One can then define the
notion of an isometry on T which is simply a one-to-one, onto map of T onto itself
that preserves distances. For the case of q = 1, the group of all possible isometries
corresponds to translations of the integers ($t \mapsto t + k$), the reflection operation ($t \mapsto -t$),
and concatenations of these. For $q \ge 2$ the group of isometries of T is significantly
larger and more complex. One extremely important result is the following [12]:

Lemma 2.1 (Extension of Isometries) Let T be a homogenous tree of order q,
let A and A' be two subsets of nodes, and let f be a local isometry from A to A', i.e.
f is a bijection from A onto A' such that

$$d(f(s), f(t)) = d(s,t) \quad \text{for all } s, t \in A \tag{2.9}$$

Then there exists an isometry $\tilde f$ of T which equals f when restricted to A.
Furthermore, if $\tilde f_1$ and $\tilde f_2$ are two such extensions of f, their restrictions to segments joining
any two points of A are identical.

Another important concept is the notion of a boundary point [1,16] of a tree.
Consider the set of infinite sequences of T where any such sequence consists of a
sequence of distinct nodes $t_1, t_2, \ldots$ where $d(t_i, t_{i+1}) = 1$. A boundary point is an
equivalence class of such sequences where two sequences are equivalent if they differ
by a finite number of nodes. For q = 1, there are only two such boundary points
corresponding to sequences increasing towards $+\infty$ or decreasing toward $-\infty$. For
q = 2 the set of boundary points is uncountable. In this case let us choose one
boundary point which we will denote by $-\infty$.

Once we have distinguished this boundary point we can identify a partial order on
T. In particular note that from any node t there is a unique path in the equivalence
class defined by $-\infty$ (i.e. a unique path from t “toward” $-\infty$). Then if we take any
two nodes s and t, their paths to $-\infty$ must differ by only a finite number of points
and thus must meet at some node which we denote by $s \wedge t$ (see Figure 2.1). We then
can define a notion of the relative distance of two nodes to $-\infty$

$$\delta(s,t) = d(s, s \wedge t) - d(t, s \wedge t) \tag{2.10}$$

so that

$s \preceq t$ (“s is at least as close to $-\infty$ as t”) if $\delta(s,t) \le 0$    (2.11)

$s \prec t$ (“s is closer to $-\infty$ than t”) if $\delta(s,t) < 0$    (2.12)

This also yields an equivalence relation on nodes of T:

$$s \asymp t \iff \delta(s,t) = 0 \tag{2.13}$$

For  example,  the  points  s,  t,  and  u  in  Figure  2.1  are  all  equivalent.  The  equivalence 
classes  of  such  nodes  are  referred  to  as  horocycles. 
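To make these definitions concrete, the sketch below codes a node as a pair `(m, n)` of scale and shift, as in Section 2.1, with the step toward $-\infty$ taken to be `(m, n) -> (m - 1, n // 2)`. This coding is our assumption for illustration only, and it is restricted to `n >= 0` so that any two paths toward $-\infty$ do meet. The function names are hypothetical.

```python
def parent(node):
    """One step toward -infinity: (m, n) -> (m - 1, n // 2)."""
    m, n = node
    return (m - 1, n // 2)


def meet(s, t):
    """Return s ^ t, the first common node of the two paths toward -infinity."""
    while s[0] > t[0]:          # bring both nodes to the same level
        s = parent(s)
    while t[0] > s[0]:
        t = parent(t)
    while s != t:               # then climb together until the paths merge
        s, t = parent(s), parent(t)
    return s


def distance(s, t):
    """d(s, t): number of branches on the path from s to t through s ^ t."""
    a = meet(s, t)
    return (s[0] - a[0]) + (t[0] - a[0])


def rel_distance(s, t):
    """delta(s, t) of (2.10): d(s, s ^ t) - d(t, s ^ t)."""
    a = meet(s, t)
    return (s[0] - a[0]) - (t[0] - a[0])


def same_horocycle(s, t):
    """Equivalence relation (2.13)."""
    return rel_distance(s, t) == 0
```

In this coding $\delta(s,t)$ reduces to the difference of the levels `m`, so a horocycle is simply the set of nodes sharing a level.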

These equivalence classes can best be visualized as in Figure 2.2 by redrawing the
tree, in essence by picking the tree up at $-\infty$ and letting the tree “hang” from this
boundary point. In this case the horocycles appear as points on the same horizontal
level and $s \preceq t$ means that s lies on a horizontal level above or at the level of t.
Note that in this way we make explicit the dyadic structure of the tree. With regard
to multiscale signal representations, a shift on the tree toward $-\infty$ corresponds to
a shift from a finer to a coarser scale and points on the same horocycle correspond
to the points at different translational shifts in the signal representation at a single
scale. Note also that we now have a simple interpretation for the nondenumerability
of the set of boundary points: they correspond to dyadic representations of all real
numbers.
2.3  Shifts  and  Transforms  on  T 

The structure of Figure 2.2 provides the basis for our development of dynamical
models on trees since it identifies a “time-like” direction corresponding to shifts toward
or away from $-\infty$. In order to define such dynamics we will need the counterpart of
the shift operators $z$ and $z^{-1}$ in order to define shifts or moves in the tree. Because of
the structure of the tree the description of these operators is a bit more complex and
in fact we introduce notation for five operators representing the following elementary
moves on the tree, which are also illustrated in Figure 2.3:

•  $0$ the identity operator (no move)

•  $\gamma^{-1}$ the backward shift (move one step toward $-\infty$)

•  $\alpha$ the left forward shift (move one step away from $-\infty$ toward the left)

•  $\beta$ the right forward shift (move one step away from $-\infty$ toward the right)

•  $\delta$ the interchange operator (move to the nearest point in the same horocycle)

Note that $0$ and $\delta$ are isometries; $\alpha$ and $\beta$ are one-to-one but not onto; $\gamma^{-1}$ is onto
but not one-to-one; and these operators satisfy the following relations (where the
convention is that the right-most operator is applied first):


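$$\gamma^{-1}\alpha = \gamma^{-1}\beta = 0 \tag{2.14}$$

$$\gamma^{-1}\delta = \gamma^{-1} \tag{2.15}$$

$$\delta^2 = 0 \tag{2.16}$$

$$\delta\beta = \alpha \tag{2.17}$$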
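The relations (2.14)-(2.17) can be checked mechanically. The following is a small sketch, assuming nodes are coded as pairs `(m, n)` with `n >= 0`, with the left and right descendants of `(m, n)` taken to be `(m + 1, 2n)` and `(m + 1, 2n + 1)`; this coding and the function names are ours, not part of the development above.

```python
def identity(t):
    """The operator 0 (no move)."""
    return t


def gamma_inv(t):
    """Backward shift: one step toward -infinity."""
    m, n = t
    return (m - 1, n // 2)


def alpha(t):
    """Left forward shift."""
    m, n = t
    return (m + 1, 2 * n)


def beta(t):
    """Right forward shift."""
    m, n = t
    return (m + 1, 2 * n + 1)


def delta(t):
    """Interchange: the other descendant of the same parent."""
    m, n = t
    return (m, n ^ 1)


def relations_hold(nodes):
    """Check (2.14)-(2.17) on every node; the right-most operator acts first."""
    for t in nodes:
        if gamma_inv(alpha(t)) != identity(t):    # (2.14), first equality
            return False
        if gamma_inv(beta(t)) != identity(t):     # (2.14), second equality
            return False
        if gamma_inv(delta(t)) != gamma_inv(t):   # (2.15)
            return False
        if delta(delta(t)) != identity(t):        # (2.16)
            return False
        if delta(beta(t)) != alpha(t):            # (2.17)
            return False
    return True
```

A call such as `relations_hold([(m, n) for m in range(-3, 4) for n in range(16)])` exercises the identities on a patch of the tree.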

Arbitrary  moves  on  the  tree  can  then  be  encoded  via  finite  strings  or  words  using 
these  symbols  as  the  alphabet  and  the  formulas  (2.14)-(2.17).  For  example,  referring 
to  Figure  2.3 

si  =  7_4t  ,  s2  =  8^~H  ,  s3  =  a8j~3t 
s4  =  01/367  H  ,  ss  =  /32a67  H  (2.18) 

It is also possible to code all points on the tree via their shifts from a specified,
arbitrary point $t_0$ taken as origin. Specifically define the language

$$\mathcal{L} = (\gamma^{-1})^* \,\cup\, \{\alpha,\beta\}^*\,\delta\,(\gamma^{-1})^* \,\cup\, \{\alpha,\beta\}^* \tag{2.19}$$

where $K^*$ denotes arbitrary sequences of symbols in K including the empty sequence
which we identify with the operator 0. Then any point $t \in T$ can be written as $\omega t_0$,
where $\omega \in \mathcal{L}$. Note that the moves in $\mathcal{L}$ are of three types: a pure shift back toward
$-\infty$ ($(\gamma^{-1})^*$); a pure descent away from $-\infty$ ($\{\alpha,\beta\}^*$); and a shift up followed by a
descent down another branch of the tree ($\{\alpha,\beta\}^*\delta(\gamma^{-1})^*$). Our use of $\delta$ in the last
category of moves ensures that the subsequent downward shift is on a different branch
than the preceding ascent. This emphasizes an issue that arises in defining dynamics
on trees. Specifically we will avoid writing strings of the form $\alpha\gamma^{-1}$ or $\beta\gamma^{-1}$. For
example $\alpha\gamma^{-1}t$ either equals $t$ or $\delta t$ depending upon whether t is the left or right
immediate descendant of another node. By using $\delta$ in our language we avoid this
issue. One price we pay is that $\mathcal{L}$ is not a semigroup since $\nu\omega$ need not be in $\mathcal{L}$ for
$\nu, \omega \in \mathcal{L}$. However, for future reference we note that, using (2.14)-(2.17), we see that
$\delta\omega$ and $\gamma^{-1}\omega$ are both in $\mathcal{L}$ for any $\omega \in \mathcal{L}$.

It is straightforward to define a length $|\omega|$ for each word in $\mathcal{L}$, corresponding to
the number of shifts required in the move specified by $\omega$. Note that

$$|\gamma^{-1}| = |\alpha| = |\beta| = 1, \qquad |0| = 0, \quad |\delta| = 2 \tag{2.20}$$

Thus $|\gamma^{-n}| = n$, $|\omega_{\alpha\beta}|$ = the number of $\alpha$’s and $\beta$’s in $\omega_{\alpha\beta} \in \{\alpha,\beta\}^*$, and
$|\omega_{\alpha\beta}\delta\gamma^{-n}| = |\omega_{\alpha\beta}| + 2 + n$.³ This notion of length will be useful in defining the order of dynamic
models on T. We will also be interested exclusively in causal models, i.e. in models in
which the output at some scale (horocycle) does not depend on finer scales. For this
reason we are most interested in moves that either involve pure ascents on the tree, i.e.
all elements of $\{\gamma^{-1}\}^*$, or elements $\omega_{\alpha\beta}\delta\gamma^{-n}$ of $\{\alpha,\beta\}^*\delta\{\gamma^{-1}\}^*$ in which the descent
is no longer than the ascent, i.e. $|\omega_{\alpha\beta}| \le n$. We use the notation $\omega \preceq 0$ to indicate
that $\omega$ is such a causal move. Note that we include moves in this causal set that are
not strictly causal in that they shift a node to another on the same horocycle. We
use the notation $\omega \asymp 0$ for such a move. The reasons for this will become clear when
we examine autoregressive models.

³ Note another consequence of the ambiguity in $\alpha\gamma^{-1}$: its “length” should either be 0 or 2.

Also, on occasion we will find it useful to use a simplified notation for particular
moves. Specifically, we define $\delta^{(n)}$ recursively, starting with $\delta^{(1)} = \delta$ and

If $t = \alpha\gamma^{-1}t$, then $\delta^{(n)}t = \alpha\,\delta^{(n-1)}\gamma^{-1}t$

If $t = \beta\gamma^{-1}t$, then $\delta^{(n)}t = \beta\,\delta^{(n-1)}\gamma^{-1}t$    (2.21)

What $\delta^{(n)}$ does is to map t to another point on the same horocycle in the following
manner: we move up the tree n steps and then descend n steps; the first step in the
descent is the opposite of the one taken on the ascent, while the remaining steps are
the same. That is, if $t = m_{\alpha\beta}\gamma^{-n+1}t$ then $\delta^{(n)}t = m_{\alpha\beta}\,\delta\,\gamma^{-n+1}t$. For example, referring
to Figure 2.3, $s_6 = \delta^{(4)}t$.
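The recursion (2.21) can be sketched as follows, again assuming the illustrative coding of nodes as pairs `(m, n)` with `n >= 0`, left descendant `(m + 1, 2n)` and right descendant `(m + 1, 2n + 1)`. Under that assumed coding, $\delta^{(k)}$ amounts to flipping bit $k-1$ of the shift index, which the sketch uses as a cross-check; the function names are ours.

```python
def delta_k(t, k):
    """delta^(k) t via the recursion (2.21), with delta^(1) = delta."""
    m, n = t
    if k == 1:
        return (m, n ^ 1)                   # plain interchange operator
    pm, pn = delta_k((m - 1, n // 2), k - 1)  # delta^(k-1) applied to gamma^{-1} t
    # Descend with the same letter (alpha or beta) that leads to t from its parent.
    return (pm + 1, 2 * pn + (n & 1))


def delta_k_closed_form(t, k):
    """Equivalent bit-flip description in the assumed (m, n) coding."""
    m, n = t
    return (m, n ^ (1 << (k - 1)))
```

Both functions keep the level `m` fixed, so the image stays on the horocycle of t, and applying either one twice returns t.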

With the notation we have defined we can now define transforms as a way in which
to encode convolutions much as z-transforms do for temporal systems. In particular
we consider systems that are specified via noncommutative formal power series [11]
of the form:

$$S = \sum_{\omega \in \mathcal{L}} s_\omega \cdot \omega \tag{2.22}$$

If the input to this system is $u_t$, $t \in T$, then the output is given by the generalized
convolution:

$$(Su)_t = \sum_{\omega \in \mathcal{L}} s_\omega\, u_{\omega t} \tag{2.23}$$
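As an illustration of the generalized convolution (2.23), the sketch below stores a system S with finitely many nonzero coefficients as a dictionary from words to coefficients. The string encoding of words (`"g"` for $\gamma^{-1}$, `"a"`, `"b"`, `"d"` for $\alpha$, $\beta$, $\delta$, and `""` for the empty word) and the `(m, n)` coding of nodes are assumptions made only for this example.

```python
def gamma_inv(t):
    m, n = t
    return (m - 1, n // 2)


def alpha(t):
    m, n = t
    return (m + 1, 2 * n)


def beta(t):
    m, n = t
    return (m + 1, 2 * n + 1)


def delta(t):
    m, n = t
    return (m, n ^ 1)


OPS = {"g": gamma_inv, "a": alpha, "b": beta, "d": delta}


def apply_word(word, t):
    """Apply a word to node t; the right-most symbol acts first."""
    for symbol in reversed(word):
        t = OPS[symbol](t)
    return t


def convolve(system, u, t):
    """(Su)_t = sum over words w of s_w * u(w t), as in (2.23)."""
    return sum(coef * u(apply_word(word, t)) for word, coef in system.items())
```

For instance, with `system = {"": 1.0, "g": 0.5, "d": 0.25}` the output at t combines the input at t, at its parent, and at its nearest neighbor on the same horocycle.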

For future reference we use the notation $S(0)$ to denote the coefficient of the empty
word in S. Also it will be necessary for us to consider particular shifted versions of
S:

$$\gamma S = \sum_{\omega \in \mathcal{L}} s_{\gamma^{-1}\omega} \cdot \omega \tag{2.24}$$

$$\delta^{(k)} S = \sum_{\omega \in \mathcal{L}} s_{\delta^{(k)}\omega} \cdot \omega \tag{2.25}$$

where we use (2.14)-(2.17) and (2.21) to write $\gamma^{-1}\omega$ and $\delta^{(k)}\omega$ as elements of $\mathcal{L}$.

2.4  Isotropic  Processes  on  Homogenous  Trees 

Consider a zero-mean stochastic process $Y_t$, $t \in T$, indexed by nodes on the tree. We
say that such a process is isotropic if the covariance between Y at any two points
depends only on the distance between the points, i.e. if there exists a sequence $r_n$, n =
0, 1, 2, . . . so that

$$E[Y_t Y_s] = r_{d(t,s)} \tag{2.26}$$

An alternate way to think of an isotropic process is that its statistics are invariant
under tree isometries. That is, if $f: T \to T$ is an isometry and if $Y_t$ is an isotropic
process, then $Z_t = Y_{f(t)}$ has the same statistics as $Y_t$. For time series this simply
states that $Y_{-t}$ and $Y_{t+k}$ have the same statistics as $Y_t$. For dyadic trees the richness
of the group of isometries makes isotropy a much stronger property.

Isotropic processes have been the subject of some study [1,2,12] in the past, and in
particular a spectral theorem has been developed that is the counterpart of Bochner’s
theorem for stationary time series. In particular Bochner’s theorem states that a
sequence $r_n$, n = 0, 1, . . . is the covariance function of a stationary time series if and
only if there exists a nonnegative, symmetric spectral measure $S(d\omega)$ so that

$$r_n = \frac{1}{2\pi}\int_{-\pi}^{\pi} e^{i\omega n}\, S(d\omega) = \frac{1}{\pi}\int_{0}^{\pi} \cos(\omega n)\, S(d\omega) \tag{2.27}$$

If we perform the change of variables $x = \cos\omega$ and note that $\cos(n\omega) = C_n(\cos\omega)$,
where $C_n(x)$ is the nth Chebychev polynomial, we have

$$r_n = \int_{-1}^{1} C_n(x)\, \mu(dx) \tag{2.28}$$
where $\mu(dx)$ is a nonnegative measure on [−1,1] (also referred to as the spectral
measure) given by

$$\mu(dx) = \frac{1}{\pi}(1 - x^2)^{-1/2}\, S(d\omega) \tag{2.29}$$

For example, for the white noise sequence with $r_n = \delta_{n0}$,

$$\mu(dx) = \frac{1}{\pi}(1 - x^2)^{-1/2}\, dx \tag{2.30}$$

The analogous theorem for isotropic processes on dyadic trees requires the
introduction of the Dunau polynomials [2,12]:

$$P_0(x) = 1, \qquad P_1(x) = x \tag{2.31}$$

$$x P_n(x) = \tfrac{2}{3} P_{n+1}(x) + \tfrac{1}{3} P_{n-1}(x) \tag{2.32}$$

Theorem 2.1 [1,2]: A sequence $r_n$, n = 0, 1, 2, ... is the covariance function of an
isotropic process on a dyadic tree if and only if there exists a nonnegative measure $\mu$
on [−1, 1] so that

$$r_n = \int_{-1}^{1} P_n(x)\, \mu(dx) \tag{2.33}$$

The simplest isotropic process on the tree is again white noise, i.e. a collection of
uncorrelated random variables indexed by T, with $r_n = \delta_{n0}$, and the spectral measure
$\mu$ in (2.33) in this case is [12]

$$\mu(dx) = \chi_{\left[-\frac{2\sqrt{2}}{3},\, \frac{2\sqrt{2}}{3}\right]}(x)\, \frac{\sqrt{8 - 9x^2}}{2\pi(1 - x^2)}\, dx \tag{2.34}$$

where $\chi_A(x)$ is the characteristic function of the set A. A key point here is that
the support of this spectral measure is smaller than the interval [−1,1]. This appears to be a direct
consequence of the large size of the boundary of the tree, which also leads to the
existence of a far larger class of singular processes than one finds for time series.
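The white-noise case can be checked numerically. The sketch below assumes the recursion $x P_n = \tfrac{2}{3}P_{n+1} + \tfrac{1}{3}P_{n-1}$ for the dyadic tree, and assumes that the white-noise spectral density is $\sqrt{8 - 9x^2}/(2\pi(1 - x^2))$ on $|x| \le 2\sqrt{2}/3$ (the standard spectral density for the 3-regular tree, rescaled to [−1,1]); both are stated here as assumptions of the example. It evaluates $\int P_n\,d\mu$ by the midpoint rule after the substitution $x = a\sin\theta$, with $a = 2\sqrt{2}/3$.

```python
import math


def dunau(n, x):
    """P_n(x) from P_0 = 1, P_1 = x and x P_n = (2/3) P_{n+1} + (1/3) P_{n-1}."""
    p_prev, p = 1.0, x
    if n == 0:
        return p_prev
    for _ in range(n - 1):
        p_prev, p = p, 1.5 * (x * p - p_prev / 3.0)
    return p


def white_noise_moment(n, steps=20000):
    """Midpoint-rule value of the integral of P_n against the assumed density."""
    a = 2.0 * math.sqrt(2.0) / 3.0          # assumed edge of the support
    h = math.pi / steps
    total = 0.0
    for k in range(steps):
        theta = -math.pi / 2.0 + (k + 0.5) * h
        x = a * math.sin(theta)
        density = math.sqrt(max(8.0 - 9.0 * x * x, 0.0)) / (2.0 * math.pi * (1.0 - x * x))
        total += dunau(n, x) * density * a * math.cos(theta) * h  # dx = a cos(theta) dtheta
    return total
```

Under these assumptions the moment for n = 0 should be close to one and the moments for n ≥ 1 close to zero, which is the statement $r_n = \delta_{n0}$.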
While Theorem 2.1 does provide a necessary and sufficient condition for a sequence
$r_n$ to be the covariance of an isotropic process, it doesn’t provide an explicit and direct
criterion in terms of the sequence values. For time series we have such a criterion
based on the fact that $r_n$ must be a positive semi-definite sequence. It is not difficult
to see that $r_n$ must also be positive semidefinite for processes on dyadic trees: form
a time series by taking any sequence $Y_{t_1}, Y_{t_2}, \ldots$ where $d(t_i, t_{i+1}) = 1$; the covariance
function of this series is $r_n$. However, thanks to the geometry of the tree and the
richness of the group of isometries of T, there are many additional constraints on $r_n$.
For example, consider the three nodes s, u, and $s \wedge t$ in Figure 2.1, and let

$$X^T = [\,Y_s,\ Y_u,\ Y_{s \wedge t}\,] \tag{2.35}$$

$$E[XX^T] = \begin{bmatrix} r_0 & r_2 & r_2 \\ r_2 & r_0 & r_2 \\ r_2 & r_2 & r_0 \end{bmatrix} \ge 0 \tag{2.36}$$

which is a constraint that is not imposed on covariance functions of time series.
Collecting all of the constraints on $r_n$ into a useful form is not an easy task. However,
as we develop in this paper, in analogy with the situation for time series, there is
an alternative method for characterizing valid covariance sequences based on the
generation of a sequence of reflection coefficients which must satisfy a far simpler set
of constraints which once again differ somewhat from those in the time series setting.
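To see how (2.36) differs from the time-series requirement, the following sketch tests positive semidefiniteness of a symmetric 3×3 matrix through its principal minors and compares the tree matrix of (2.36) with the 3×3 Toeplitz matrix of a time series. The sample sequence $r_n = \cos(\pi n/2)$ (the covariance of a random-phase sinusoid) and the helper names are our choices for illustration.

```python
def is_psd_3x3(M, tol=1e-12):
    """Symmetric 3x3 matrix is PSD iff all of its principal minors are >= 0."""
    (a, b, c), (d, e, f), (g, h, i) = M
    minors = [
        a, e, i,                                              # 1x1 minors
        a * e - b * d, a * i - c * g, e * i - f * h,          # 2x2 minors
        a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g),  # determinant
    ]
    return all(m >= -tol for m in minors)


def tree_constraint_ok(r0, r2):
    """Constraint (2.36): three nodes that are pairwise at distance 2."""
    return is_psd_3x3([[r0, r2, r2], [r2, r0, r2], [r2, r2, r0]])


def time_series_ok(r0, r1, r2):
    """Toeplitz covariance of three consecutive samples of a time series."""
    return is_psd_3x3([[r0, r1, r2], [r1, r0, r1], [r2, r1, r0]])
```

With $r_0 = 1$, $r_1 = 0$, $r_2 = -1$ the Toeplitz test passes while the tree test fails; the eigenvalues of the matrix in (2.36) are $r_0 + 2r_2$ and $r_0 - r_2$ (twice), so the tree requires $-r_0/2 \le r_2 \le r_0$.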

2.5  Models  for  Stochastic  Processes  on  Trees 

As for time series it is of considerable interest to develop white-noise-driven models
for processes on trees. The most general input-output form for such a model is simply

$$Y_t = \sum_{s \in T} c_{t,s}\, W_s \tag{2.37}$$

where $W_t$ is a white noise process with unit variance. In general the output of this
system is not isotropic and it is of interest to find models that do produce isotropic
processes. One class introduced in [1] has the form

$$Y_t = \sum_{s \in T} c_{d(s,t)}\, W_s \tag{2.38}$$

To show that this is isotropic, let $(s,t)$ and $(s',t')$ be two pairs of points such that
$d(s,t) = d(s',t')$. By Lemma 2.1 there exists an isometry f so that $f(s) = s'$,
$f(t) = t'$. Then

$$\begin{aligned}
E[Y_{s'}Y_{t'}] &= \sum_{u} c_{d(s',u)}\, c_{d(t',u)} \\
&= \sum_{u'} c_{d(s',f(u'))}\, c_{d(t',f(u'))} \\
&= \sum_{u'} c_{d(f(s),f(u'))}\, c_{d(f(t),f(u'))} \\
&= \sum_{u'} c_{d(s,u')}\, c_{d(t,u')} = E[Y_s Y_t]
\end{aligned} \tag{2.39}$$
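The isotropy of (2.38) can also be observed numerically on a finite piece of the tree. The sketch below assumes the weights $c_d$ vanish beyond some distance K, builds the ball of radius `inner + K` around a centre node of the tree of order q = 2, and tabulates $E[Y_sY_t] = \sum_u c_{d(s,u)}c_{d(t,u)}$ for all pairs of nodes within distance `inner` of the centre, so that truncating the tree does not affect the sums. The construction and names are illustrative only.

```python
from collections import deque


def build_ball(radius):
    """Adjacency lists of the ball of given radius in the tree of order q = 2."""
    adj = {0: []}
    frontier, next_id = [0], 1
    for _ in range(radius):
        new_frontier = []
        for v in frontier:
            while len(adj[v]) < 3:          # q + 1 = 3 branches per node
                adj[v].append(next_id)
                adj[next_id] = [v]
                new_frontier.append(next_id)
                next_id += 1
        frontier = new_frontier
    return adj


def distances_from(adj, src, max_dist):
    """Breadth-first distances from src, limited to max_dist."""
    dist = {src: 0}
    queue = deque([src])
    while queue:
        v = queue.popleft()
        if dist[v] < max_dist:
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
    return dist


def covariances_by_distance(c, inner=3):
    """Map each distance d(s, t) to the set of covariance values found at it."""
    K = len(c) - 1                          # c[d] is taken to be 0 for d > K
    adj = build_ball(inner + K)
    inner_nodes = list(distances_from(adj, 0, inner))
    near = {s: distances_from(adj, s, K) for s in inner_nodes}
    table = {}
    for s in inner_nodes:
        d_from_s = distances_from(adj, s, 2 * inner)
        for t in inner_nodes:
            cov = sum(c[near[s][u]] * c[near[t][u]] for u in near[s] if u in near[t])
            table.setdefault(d_from_s[t], set()).add(round(cov, 12))
    return table
```

If the process is isotropic, every set in the returned table contains a single value, namely $r_d$ for that distance.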

The class of systems of the form of (2.38) is the generalization of the class of
zero-phase LTI systems (i.e. systems with impulse responses of the form $h(t,s) = h(|t-s|)$).
On  the  other  hand,  we  know  that  for  time  series  any  LTI  stable  system,  and  in 
particular  any  causal,  stable  system,  yields  a  stationary  output  when  driven  by  white 
noise.  A  major  objective  of  this  paper  is  to  find  the  class  of  causal  models  on  trees 
that  produce  isotropic  processes  when  driven  by  white  noise.  Such  a  class  of  models 
will  then  also  provide  us  with  the  counterpart  of  the  Wold  decomposition  of  a  time 
series  as  a  weighted  sum  of  “past”  values  of  a  white  noise  process. 

A logical starting point for such an investigation is the class of models introduced
in Section 2.3

$$Y_t = (SW)_t, \qquad S = \sum_{\omega \preceq 0} s_\omega \cdot \omega \tag{2.40}$$

However, it is not true that $Y_t$ is isotropic for an arbitrary choice of S. For example
if $S = 1 + a\gamma^{-1}$, it is straightforward to check that $Y_t$ is not isotropic. Thus we must
look for a subset of this class of models. As we will see the correct model set is the
class of autoregressive (AR) processes, where an AR process of order p has the form

$$Y_t = \sum_{\substack{\omega \preceq 0 \\ |\omega| \le p}} a_\omega\, Y_{\omega t} + \sigma W_t \tag{2.41}$$

where $W_t$ is a white noise with unit variance.
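The claim that $S = 1 + a\gamma^{-1}$ does not produce an isotropic process can be checked directly: $Y_t = W_t + aW_{\gamma^{-1}t}$, so the covariance of two outputs is the sum of products of weights on shared white-noise nodes. The sketch below assumes the illustrative coding of nodes as pairs `(m, n)` with parent `(m - 1, n // 2)`; the names are ours.

```python
def parent(t):
    m, n = t
    return (m - 1, n // 2)


def sibling(t):
    m, n = t
    return (m, n ^ 1)


def ma_weights(t, a):
    """Weights of the white noise in Y_t = W_t + a W_{parent(t)}."""
    return {t: 1.0, parent(t): a}


def covariance(s, t, a):
    """E[Y_s Y_t] for unit-variance white noise W."""
    ws, wt = ma_weights(s, a), ma_weights(t, a)
    return sum(ws[u] * wt[u] for u in ws if u in wt)
```

For t = (3, 5), the nodes $\delta t$ and $\gamma^{-2}t$ are both at distance 2 from t, yet `covariance(t, sibling(t), a)` equals $a^2$ while `covariance(t, parent(parent(t)), a)` equals 0, so the covariance is not a function of distance alone.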

The form of (2.41) deserves some comment. First note that the constraints placed
on $\omega$ in the summation of (2.41) state that $Y_t$ is a linear combination of the white
noise $W_t$ and the values of Y at nodes that are both at distances at most p from
t ($|\omega| \le p$) and also on the same or previous horocycles ($\omega \preceq 0$). Thus the model
(2.41) is not strictly “causal” and is indeed an implicit specification since values of
Y on the same horocycle depend on each other through (2.41) (see the second-order
example to follow). A question that then arises is: why not look instead at models in
which $Y_t$ depends only on its “strict” past, i.e. on points of the form $\gamma^{-n}t$? As shown
in Appendix A, the additional constraints required of isotropic processes make this
class quite small. Specifically consider an isotropic process $Y_t$ that does have this
strict dependence:

$$Y_t = \sum_{n=0}^{\infty} a_n\, W_{\gamma^{-n}t} \tag{2.42}$$

In Appendix A we show that the coefficients $a_n$ must be of the form

$$a_n = \sigma a^n \tag{2.43}$$

so that the only process with strict past dependence as in (2.42) is the AR(1) process

$$Y_t = a\,Y_{\gamma^{-1}t} + \sigma W_t \tag{2.44}$$
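For the AR(1) process the isotropy can be verified from the moving-average form (2.42)-(2.43): only the common ancestors of s and t contribute to $E[Y_sY_t]$, which gives $r_n = \sigma^2 a^n/(1 - a^2)$ for $|a| < 1$. The sketch below compares a truncated sum with this closed form, assuming the illustrative coding of nodes as `(m, n)` with `n >= 0`; the closed form is our own short calculation, not quoted from the text.

```python
def parent(t):
    m, n = t
    return (m - 1, n // 2)


def ar1_covariance(s, t, a, sigma, terms=200):
    """E[Y_s Y_t] for Y_t = sigma * sum_n a^n W_{gamma^{-n} t}, truncated."""
    ks = kt = 0                      # steps from s and from t up to s ^ t
    while s[0] > t[0]:
        s, ks = parent(s), ks + 1
    while t[0] > s[0]:
        t, kt = parent(t), kt + 1
    while s != t:
        s, t = parent(s), parent(t)
        ks, kt = ks + 1, kt + 1
    # The n-th ancestor of s ^ t carries weight a^(ks+n) in Y_s and a^(kt+n) in Y_t.
    return sigma ** 2 * sum(a ** (ks + n) * a ** (kt + n) for n in range(terms))


def ar1_closed_form(d, a, sigma):
    """r_d = sigma^2 a^d / (1 - a^2), depending only on the distance d."""
    return sigma ** 2 * a ** d / (1.0 - a ** 2)
```

Pairs of nodes at the same distance but in different relative positions, e.g. two nodes on one horocycle versus a node and its ancestor, give the same value, as isotropy requires.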

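Consider next the AR(2) process, which, specializing (2.41), has the form

$$Y_t = a_1 Y_{\gamma^{-1}t} + a_2 Y_{\gamma^{-2}t} + a_3 Y_{\delta t} + \sigma W_t \tag{2.45}$$

Note first that this is indeed an implicit specification, since if we evaluate (2.45) at
$\delta t$ rather than t we see that

$$Y_{\delta t} = a_1 Y_{\gamma^{-1}t} + a_2 Y_{\gamma^{-2}t} + a_3 Y_t + \sigma W_{\delta t} \tag{2.46}$$

We can, of course, solve the pair (2.45), (2.46) to obtain the explicit formulae

$$Y_t = \left(\frac{a_1}{1 - a_3}\right) Y_{\gamma^{-1}t} + \left(\frac{a_2}{1 - a_3}\right) Y_{\gamma^{-2}t} + \sigma V_t \tag{2.47}$$

$$Y_{\delta t} = \left(\frac{a_1}{1 - a_3}\right) Y_{\gamma^{-1}t} + \left(\frac{a_2}{1 - a_3}\right) Y_{\gamma^{-2}t} + \sigma V_{\delta t} \tag{2.48}$$

where

$$V_t = \frac{1}{1 - a_3^2}\left\{ W_t + a_3 W_{\delta t} \right\} \tag{2.49}$$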
Note that $V_t$ is correlated with $V_{\delta t}$ and is uncorrelated with other values of V and
thus is not an isotropic process (since $E[V_t V_{\gamma^{-2}t}] \ne E[V_t V_{\delta t}]$). Thus while the explicit
representation  (2.47)-(2.48)  may  be  of  some  value  in  some  contexts  (e.g.  in  [17]  we 
use  similar  nonisotropic  models  to  analyze  some  estimation  problems),  the  implicit 
characterization  (2.45)  is  the  more  natural  choice  for  a  generalization  of  AR  modeling. 
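The passage from the implicit pair (2.45), (2.46) to the explicit form can be confirmed numerically. The sketch below solves the two equations for $Y_t$ and $Y_{\delta t}$ by Cramer's rule, given arbitrary values of $Y_{\gamma^{-1}t}$, $Y_{\gamma^{-2}t}$, $W_t$ and $W_{\delta t}$, and compares the result with the explicit expression $\frac{a_1}{1-a_3}Y_{\gamma^{-1}t} + \frac{a_2}{1-a_3}Y_{\gamma^{-2}t} + \frac{\sigma}{1-a_3^2}(W_t + a_3W_{\delta t})$. The function and argument names are illustrative.

```python
def solve_implicit(a1, a2, a3, sigma, y_p, y_pp, w_t, w_dt):
    """Solve (2.45)-(2.46) for (Y_t, Y_{delta t}) by Cramer's rule.

    The system is  Y_t - a3 * Y_dt = b1  and  -a3 * Y_t + Y_dt = b2,
    where y_p and y_pp stand for Y at gamma^{-1} t and gamma^{-2} t.
    """
    b1 = a1 * y_p + a2 * y_pp + sigma * w_t
    b2 = a1 * y_p + a2 * y_pp + sigma * w_dt
    det = 1.0 - a3 ** 2
    return (b1 + a3 * b2) / det, (b2 + a3 * b1) / det


def explicit_form(a1, a2, a3, sigma, y_p, y_pp, w_t, w_dt):
    """Explicit expressions for (Y_t, Y_{delta t}) with the correlated noise V."""
    v_t = (w_t + a3 * w_dt) / (1.0 - a3 ** 2)
    v_dt = (w_dt + a3 * w_t) / (1.0 - a3 ** 2)
    common = (a1 * y_p + a2 * y_pp) / (1.0 - a3)
    return common + sigma * v_t, common + sigma * v_dt
```

Both routes require $a_3^2 \ne 1$; the explicit noise terms for t and $\delta t$ share the two inputs $W_t$ and $W_{\delta t}$, which is the source of their correlation.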

Another important point to note is that the second-order AR(2) model has four
coefficients, three a’s and $\sigma$, while for time series there would only be two a’s. Indeed
a simple calculation shows that our AR(p) model has $(2^p - 1)$ a’s and one $\sigma$ in contrast
to the p a’s and one $\sigma$ for time series. On the other hand, the coefficients in our AR
model are not independent and indeed there exist nonlinear relationships among the
coefficients. For example for the second-order model (2.45) $a_3 \ne 0$ if $a_2 \ne 0$ since
we know that the only isotropic process with strict past dependence is AR(1). In
Appendix B we show that the coefficients $a_1$, $a_2$, and $a_3$ in (2.45) are related by a
4th-order polynomial relation.

Because of the complex relationship among the $a_\omega$’s in (2.41), the representation
is  not  a  completely  satisfactory  parameterization  of  this  class  of  models.  As  we  will 
see  in  subsequent  sections,  an  alternate  parametrization,  provided  by  a  generalization 
of  Schur  and  Levinson  recursions,  provides  us  with  a  much  better  parametrization. 
In  particular  this  parametrization  involves  a  sequence  of  reflection  coefficients  for  AR 
processes  on  trees  where  exactly  one  new  reflection  coefficient  is  added  as  the  AR 
order  is  increased  by  one. 
3  Reflection  Coefficients  and  Levinson  and  Schur  Recursions  for  Isotropic  Trees

As outlined in the preceding section the direct parametrization of isotropic AR models in terms of their coefficients $\{a_w\}$ is not completely satisfactory since the number of coefficients grows exponentially with the order $p$, and at the same time there is a growing number of nonlinear constraints among the coefficients. In this section we
develop  an  alternate  characterization  involving  one  new  coefficient  when  the  order 
is  increased  by  one.  This  development  is  based  on  the  construction  of  “prediction” 
filters  of  increasing  order,  in  analogy  with  the  procedures  developed  for  time  series 
[8,9]  that  lead  to  lattice  filter  models  and  whitening  filters  for  AR  processes.  As  is  the 
case  for  time  series,  the  single  new  parameter  introduced  at  each  stage,  which  we  will 
also  refer  to  as  a  reflection  coefficient,  is  not  subject  to  complex  constraints  involving 
reflection coefficients of other orders. Therefore, in contrast to the case of time series, for which both the reflection coefficient representation and the direct parametrization in terms of AR coefficients are “canonic” (i.e. there are as many degrees of freedom as there are coefficients), the reflection coefficient representation for processes on trees appears to be the only natural canonic representation. Also, as for time series, we will see that each reflection coefficient is subject to bounds on its value which capture the constraint that $r_n$ must be a valid covariance function of an isotropic process. Since this is a more severe and complex constraint on $r_n$ than arises for time series, one would expect that the resulting bounds on the reflection coefficients would be somewhat different. This is the case, although somewhat surprisingly the constraints involve only a very simple modification to those for time series.

As for time series the recursion relations that yield the reflection coefficients arise from the development of forward and backward prediction error filters for $Y_t$. One crucial difference with time series is that the dimension of the output of these prediction error filters increases with increasing filter order. This is a direct consequence of the structure of the AR model (2.41) and the fact that, unlike the real line, the number of points a distance $p$ from a node on a tree increases geometrically with $p$. For example, from (2.45)-(2.49) we see that $Y_t$ and $Y_{\delta t}$ are closely coupled in the AR(2) model, and thus their prediction might best be considered simultaneously. For higher orders the coupling involves (a geometrically growing number of) additional $Y$'s. In this section we set up the proper definitions of these vectors of forward and backward prediction variables, and, thanks to isotropy, deduce that only one new coefficient is needed as the filter order is increased by one. This leads to the desired scalar recursions. In the next section we use the prediction filter origin of these recursions to construct lattice forms for modeling and whitening filters. Because of the variation in filter dimension the lattice segments are somewhat more complex and capture the fact that as we move inward toward a node, dimensionality decreases, while it increases if we expand outward.

3.1  Forward  and  Backward  Prediction  Errors 

Let $Y_t$ be an isotropic process on a tree, and let $\mathcal{H}\{\cdots\}$ denote the linear span of the random variables indicated between the braces. As developed in [9], the basic idea behind the construction of prediction models of increasing orders for time series is the construction of the past of a point $t$: $\mathcal{Y}_{t,n} = \mathcal{H}\{Y_{t-k} \mid 0 \le k \le n\}$ and the consideration of the sequence of spaces as $n$ increases. In analogy with this, we define the past of the node $t$ on our tree:

$$\mathcal{Y}_{t,n} = \mathcal{H}\{Y_{wt} : w \preceq 0,\ |w| \le n\} \qquad (3.1)$$

One way to think of the past for time series is to take the set of all points within a distance $n$ of $t$ and then to discard the future points. This is exactly what (3.1) is: $\mathcal{Y}_{t,n}$ contains all points $Y_s$ on previous horocycles ($s \prec t$) and on the same horocycle ($s \asymp t$) as long as $d(s,t) \le n$. A critical point to note is that in going from $\mathcal{Y}_{t,n-1}$ to $\mathcal{Y}_{t,n}$ we add new points on the same horocycle as $t$ if $n$ is even but not if $n$ is odd (see the example to follow and Figures 3.1-3.4).

In analogy with the time series case, the backward innovations or prediction errors are defined as the variables spanning the new information, $\mathcal{F}_{t,n}$, in $\mathcal{Y}_{t,n}$ not contained in $\mathcal{Y}_{t,n-1}$:

$$\mathcal{Y}_{t,n} = \mathcal{Y}_{t,n-1} \oplus \mathcal{F}_{t,n} \qquad (3.2)$$

so that $\mathcal{F}_{t,n}$ is the orthogonal complement of $\mathcal{Y}_{t,n-1}$ in $\mathcal{Y}_{t,n}$, which we also denote by $\mathcal{F}_{t,n} = \mathcal{Y}_{t,n} \ominus \mathcal{Y}_{t,n-1}$. Define the backward prediction errors for the “new” elements of the “past” introduced at the $n$th step, i.e. for $w \preceq 0$ and $|w| = n$, define

$$F_{t,n}(w) = Y_{wt} - E(Y_{wt} \mid \mathcal{Y}_{t,n-1}) \qquad (3.3)$$

where $E(x \mid \mathcal{Y})$ denotes the linear least-squares estimate of $x$ based on data spanning $\mathcal{Y}$. Then

$$\mathcal{F}_{t,n} = \mathcal{H}\{F_{t,n}(w) : |w| = n,\ w \preceq 0\} \qquad (3.4)$$

For time series the forward innovation is the difference between $Y_t$ and its estimate based on the past of $Y_{t-1}$. In a similar fashion define the forward innovations

$$E_{t,n}(w) = Y_{wt} - E(Y_{wt} \mid \mathcal{Y}_{\gamma^{-1}t,n-1}) \qquad (3.5)$$

where $w$ ranges over a set of words such that $wt$ is on the same horocycle as $t$ and at a distance at most $n - 1$ from $t$ (so that $\mathcal{Y}_{\gamma^{-1}t,n-1}$ is the past of that point as well), i.e. $|w| < n$ and $w \asymp 0$. Define

$$\mathcal{E}_{t,n} = \mathcal{H}\{E_{t,n}(w) : |w| < n \text{ and } w \asymp 0\} \qquad (3.6)$$

Let $E_{t,n}$ denote the column vector of the $E_{t,n}(w)$. A simple calculation shows that

$$\dim E_{t,n} = 2^{[\frac{n-1}{2}]} \qquad (3.7)$$

where $[x]$ denotes the largest integer $\le x$. The elements of $E_{t,n}$ are ordered according to a dyadic representation of the words $w$ for which $|w| < n$, $w \asymp 0$. Specifically any such $w$ other than 0 must have the form

$$w = \delta^{(i_1)}\delta^{(i_2)}\cdots\delta^{(i_k)} \qquad (3.8)$$

with

$$1 \le i_1 < i_2 < \cdots < i_k \le \left[\frac{n-1}{2}\right] \qquad (3.9)$$

and with $|w| = 2i_k$. For example the points $wt$ for $w = 0, \delta, \delta^{(2)}$, and $\delta\delta^{(2)}$ are illustrated in Figure 3.4 (see footnote 4). Thus the words $w$ of interest are in one-to-one correspondence with the numbers 0 and $\sum_{j=1}^{k} 2^{i_j}$, which provides us with our ordering.

In a similar fashion, let $F_{t,n}$ denote the column vector of the $F_{t,n}(w)$. In this case

$$\dim F_{t,n} = 2^{[\frac{n}{2}]} \qquad (3.10)$$

The elements of $F_{t,n}$ are ordered as follows. Note that any word $w$ for which $|w| = n$ and $w \preceq 0$ can be written as $w = \tilde{w}\gamma^{-k}$ for some $k \ge 0$ and $\tilde{w} \asymp 0$. For example, as illustrated in Figure 3.4, for $n = 5$ the set of such $w$'s is ($\delta^{(2)}\gamma^{-1}$, $\delta\delta^{(2)}\gamma^{-1}$, $\delta\gamma^{-3}$, and $\gamma^{-5}$). We order the $w$'s as follows: first we group them in order of increasing $k$ and then for fixed $k$ we use the same ordering as for $E_{t,n}$ on the $\tilde{w}$.
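The two orderings just described can be sketched in a few lines of Python. This is only an illustration: the function names and the encoding of a word by its tuple of indices $(i_1, \ldots, i_k)$ are assumptions made here, and the sort key $\sum_j 2^{i_j}$ is taken from the correspondence stated above (any increasing function of the $i_j$ in the exponent would give the same order).

```python
from itertools import combinations


def forward_words(n):
    """Index tuples (i_1 < ... < i_k) of the words w = d^(i_1)...d^(i_k)
    with |w| < n on the same horocycle as t; () stands for w = 0.
    The list is sorted by the dyadic key sum(2**i_j)."""
    top = (n - 1) // 2
    words = [c for k in range(top + 1)
             for c in combinations(range(1, top + 1), k)]
    return sorted(words, key=lambda idx: sum(2 ** i for i in idx))


def backward_words(n):
    """Pairs (k, idx) standing for w = w~ gamma^{-k} with |w| = n,
    grouped by increasing k, then ordered like the forward words."""
    out = []
    for k in range(n + 1):
        length = n - k              # required length |w~|
        if length % 2:              # words on the same horocycle have even length
            continue
        for idx in forward_words(length + 1):
            if 2 * (idx[-1] if idx else 0) == length:   # |w~| = 2 * i_k
                out.append((k, idx))
    return out


print(forward_words(5))   # words 0, d, d^(2), d d^(2)
print(backward_words(5))  # d^(2) g^-1, d d^(2) g^-1, d g^-3, g^-5
```

For $n = 5$ the second list reproduces the order quoted in the text, and the list lengths agree with (3.7) and (3.10).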

Example 3.1 In order to illustrate the geometry of the problem, consider the cases $n = 1, 2, 3, 4, 5$. The first two are illustrated in Figure 3.1 and the last three are in Figures 3.2-3.4 respectively. In each figure the points comprising $E_{t,n}$ are marked with dots, while those forming $F_{t,n}$ are indicated by squares.

n = 1 (See Figure 3.1): To begin we have

$$\mathcal{Y}_{t,0} = \mathcal{H}\{Y_t\}$$

The only word $w$ for which $|w| = 1$ and $w \preceq 0$ is $w = \gamma^{-1}$. Therefore

$$F_{t,1} = F_{t,1}(\gamma^{-1}) = Y_{\gamma^{-1}t} - E(Y_{\gamma^{-1}t} \mid Y_t)$$

Also

$$\mathcal{Y}_{\gamma^{-1}t,0} = \mathcal{H}\{Y_{\gamma^{-1}t}\}$$

and the only word $w$ for which $|w| < 1$ and $w \asymp 0$ is $w = 0$. Thus

$$E_{t,1} = E_{t,1}(0) = Y_t - E(Y_t \mid Y_{\gamma^{-1}t})$$

Footnote 4: In the figure these points appear to be ordered left to right along the horocycle. This, however, is due only to the fact that $t$ was taken at the left of the horocycle.
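n = 2 (See Figure 3.1): Here

$$\mathcal{Y}_{t,1} = \mathcal{H}\{Y_t, Y_{\gamma^{-1}t}\}$$

In this case $|w| = 2$ and $w \preceq 0$ implies that $w = \delta$ or $\gamma^{-2}$. Thus

$$F_{t,2} = \begin{pmatrix} F_{t,2}(\delta) \\ F_{t,2}(\gamma^{-2}) \end{pmatrix} = \begin{pmatrix} Y_{\delta t} - E(Y_{\delta t} \mid Y_t, Y_{\gamma^{-1}t}) \\ Y_{\gamma^{-2}t} - E(Y_{\gamma^{-2}t} \mid Y_t, Y_{\gamma^{-1}t}) \end{pmatrix}$$

Similarly,

$$\mathcal{Y}_{\gamma^{-1}t,1} = \mathcal{H}\{Y_{\gamma^{-1}t}, Y_{\gamma^{-2}t}\}$$

and 0 is the only word satisfying $|w| < 2$ and $w \asymp 0$. Hence

$$E_{t,2} = E_{t,2}(0) = Y_t - E(Y_t \mid Y_{\gamma^{-1}t}, Y_{\gamma^{-2}t})$$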

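n = 3 (See Figure 3.2): In this case

$$\mathcal{Y}_{t,2} = \mathcal{H}\{Y_t, Y_{\gamma^{-1}t}, Y_{\gamma^{-2}t}, Y_{\delta t}\}$$

Also

$$F_{t,3} = \begin{pmatrix} F_{t,3}(\delta\gamma^{-1}) \\ F_{t,3}(\gamma^{-3}) \end{pmatrix} = \begin{pmatrix} Y_{\delta\gamma^{-1}t} - E(Y_{\delta\gamma^{-1}t} \mid Y_t, Y_{\gamma^{-1}t}, Y_{\gamma^{-2}t}, Y_{\delta t}) \\ Y_{\gamma^{-3}t} - E(Y_{\gamma^{-3}t} \mid Y_t, Y_{\gamma^{-1}t}, Y_{\gamma^{-2}t}, Y_{\delta t}) \end{pmatrix}$$

$$\mathcal{Y}_{\gamma^{-1}t,2} = \mathcal{H}\{Y_{\gamma^{-1}t}, Y_{\gamma^{-2}t}, Y_{\gamma^{-3}t}, Y_{\delta\gamma^{-1}t}\}$$

and there are two words, namely 0 and $\delta$, satisfying $|w| < 3$ and $w \asymp 0$:

$$E_{t,3} = \begin{pmatrix} E_{t,3}(0) \\ E_{t,3}(\delta) \end{pmatrix} = \begin{pmatrix} Y_t - E(Y_t \mid Y_{\gamma^{-1}t}, Y_{\gamma^{-2}t}, Y_{\gamma^{-3}t}, Y_{\delta\gamma^{-1}t}) \\ Y_{\delta t} - E(Y_{\delta t} \mid Y_{\gamma^{-1}t}, Y_{\gamma^{-2}t}, Y_{\gamma^{-3}t}, Y_{\delta\gamma^{-1}t}) \end{pmatrix}$$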
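n = 4 (See Figure 3.3):

$$\mathcal{Y}_{t,3} = \mathcal{H}\{Y_t, Y_{\gamma^{-1}t}, Y_{\gamma^{-2}t}, Y_{\delta t}, Y_{\gamma^{-3}t}, Y_{\delta\gamma^{-1}t}\}$$

$$F_{t,4} = \begin{pmatrix} F_{t,4}(\delta^{(2)}) \\ F_{t,4}(\delta\delta^{(2)}) \\ F_{t,4}(\delta\gamma^{-2}) \\ F_{t,4}(\gamma^{-4}) \end{pmatrix}$$

$$\mathcal{Y}_{\gamma^{-1}t,3} = \mathcal{H}\{Y_{\gamma^{-1}t}, Y_{\gamma^{-2}t}, Y_{\gamma^{-3}t}, Y_{\gamma^{-4}t}, Y_{\delta\gamma^{-1}t}, Y_{\delta\gamma^{-2}t}\}$$

$$E_{t,4} = \begin{pmatrix} E_{t,4}(0) \\ E_{t,4}(\delta) \end{pmatrix}$$

n = 5 (See Figure 3.4):

$$\mathcal{Y}_{t,4} = \mathcal{H}\{Y_t, Y_{\gamma^{-1}t}, Y_{\gamma^{-2}t}, Y_{\delta t}, Y_{\gamma^{-3}t}, Y_{\delta\gamma^{-1}t}, Y_{\gamma^{-4}t}, Y_{\delta\gamma^{-2}t}, Y_{\delta^{(2)}t}, Y_{\delta\delta^{(2)}t}\}$$

$$F_{t,5} = \begin{pmatrix} F_{t,5}(\delta^{(2)}\gamma^{-1}) \\ F_{t,5}(\delta\delta^{(2)}\gamma^{-1}) \\ F_{t,5}(\delta\gamma^{-3}) \\ F_{t,5}(\gamma^{-5}) \end{pmatrix}$$

$$\mathcal{Y}_{\gamma^{-1}t,4} = \mathcal{H}\{Y_{\gamma^{-1}t}, Y_{\gamma^{-2}t}, Y_{\gamma^{-3}t}, Y_{\delta\gamma^{-1}t}, Y_{\gamma^{-4}t}, Y_{\delta\gamma^{-2}t}, Y_{\gamma^{-5}t}, Y_{\delta\gamma^{-3}t}, Y_{\delta^{(2)}\gamma^{-1}t}, Y_{\delta\delta^{(2)}\gamma^{-1}t}\}$$

$$E_{t,5} = \begin{pmatrix} E_{t,5}(0) \\ E_{t,5}(\delta) \\ E_{t,5}(\delta^{(2)}) \\ E_{t,5}(\delta\delta^{(2)}) \end{pmatrix}$$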

Let us make a few comments about the structure of these prediction error vectors. Note first that for $n$ odd, $\dim F_{t,n} = \dim E_{t,n}$, while for $n$ even $\dim F_{t,n} = 2\dim E_{t,n}$. Indeed for $n$ even $F_{t,n}$ includes some points on the same horocycle as $t$ (namely $wt$ for $|w| = n$, $w \asymp 0$), e.g. for $n = 2$, $F_{t,2}(\delta)$ is an element of $F_{t,2}$. These are the points that are on the backward-expanding boundary of the “past”. At the next stage, however, these points become part of $E_{t,n}$, e.g. for $n = 3$, $E_{t,3}(\delta)$ is an element of $E_{t,3}$. This captures the fact mentioned previously that as the order of an AR model increases, an increasing number of points on the same horocycle are coupled.
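The dimension counts above can be checked by brute force on a finite piece of the dyadic tree. The sketch below is an illustration only: the node encoding `(level, index)`, with parent `(level - 1, index // 2)`, and the function names are assumptions made here. The node $t$ is placed $n$ levels below a root that plays the role of $\gamma^{-n}t$; any node of the past not descending from that root is farther than $n$ from $t$, so nothing is missed.

```python
def distance(a, b):
    """Graph distance between nodes (level, index) of a dyadic tree
    in which the parent of (l, i) is (l - 1, i // 2)."""
    (la, ia), (lb, ib) = a, b
    d = 0
    while la > lb:                     # lift the deeper node
        la, ia, d = la - 1, ia // 2, d + 1
    while lb > la:
        lb, ib, d = lb - 1, ib // 2, d + 1
    while ia != ib:                    # climb together to the common ancestor
        ia, ib, d = ia // 2, ib // 2, d + 2
    return d


def error_vector_sizes(n):
    """Return (dim E_{t,n}, dim F_{t,n}) by direct enumeration."""
    t = (n, 0)
    # F: nodes on the same or an earlier horocycle at distance exactly n
    backward = sum(1 for lvl in range(n + 1) for i in range(2 ** lvl)
                   if distance((lvl, i), t) == n)
    # E: nodes on the same horocycle at distance strictly less than n
    forward = sum(1 for i in range(2 ** n) if distance((n, i), t) < n)
    return forward, backward


for n in range(1, 7):
    print(n, error_vector_sizes(n))
```

The enumeration agrees with (3.7) and (3.10), and shows the pattern noted above: equal sizes for $n$ odd, and a backward vector twice as long for $n$ even.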

As a second point, note that we have already provided a simple interpretation (3.2) of $\mathcal{F}_{t,n}$ as an orthogonal complement. As for time series, this will be crucial in the development of our recursions. We will also need similar representations for $\mathcal{E}_{t,n}$. It is straightforward to check that for $n$ odd

$$\mathcal{Y}_{t,n} \ominus \mathcal{Y}_{\gamma^{-1}t,n-1} = \mathcal{E}_{t,n} \qquad (3.11)$$

(this can be checked for $n = 1$ and 3 from Example 3.1), while for $n$ even

$$\mathcal{Y}_{t,n} \ominus \mathcal{Y}_{\gamma^{-1}t,n-1} = \mathcal{E}_{t,n} \oplus \mathcal{E}_{\delta^{(n/2)}t,n} \qquad (3.12)$$

For example for $n = 2$ this can be checked from the calculations in Example 3.1 plus the fact that

$$E_{\delta t,2} = Y_{\delta t} - E[Y_{\delta t} \mid Y_{\gamma^{-1}t}, Y_{\gamma^{-2}t}]$$

Finally, it is important to note that the process $E_{t,n}$ (for $n$ fixed) is not in general an isotropic process (we will provide a counterexample shortly). However, if $Y_t$ is AR($p$) and $n \ge p$, then, after an appropriate normalization, $E_{t,n}$ is white noise. This is in contrast to the case of time series, in which the prediction errors for all order models are stationary (and become white if $n \ge p$). In the case of processes on trees $E_{t,n}$ has statistics that are in general invariant with respect to some of the isometries of $\mathcal{T}$ but not all of them.

3.2  Calculation  of  Prediction  Errors  by  Levinson  Recursions  on  the  Order

We are now in a position to develop recursions in $n$ for the $F_{t,n}(w)$ and $E_{t,n}(w)$. Our approach follows that for time series except that we must deal with the more complex geometry of the tree. In particular because of this geometry and the changing dimensions of $F_{t,n}$ and $E_{t,n}$, it is necessary to distinguish the cases of $n$ even and $n$ odd.

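n even

Consider first $F_{t,n}(w)$ for $|w| = n$, $w \preceq 0$. There are two natural subclasses for these words $w$. In particular either $w \prec 0$ or $w \asymp 0$.

Case 1: Suppose that $w \prec 0$. Then $w = \tilde{w}\gamma^{-1}$ for some $\tilde{w} \preceq 0$ with $|\tilde{w}| = n - 1$. We then can perform the following computation, using (3.3) and properties of orthogonal projections:

$$\begin{aligned} F_{t,n}(w) &= Y_{\tilde{w}\gamma^{-1}t} - E(Y_{\tilde{w}\gamma^{-1}t} \mid \mathcal{Y}_{t,n-1}) \\ &= Y_{\tilde{w}\gamma^{-1}t} - E(Y_{\tilde{w}\gamma^{-1}t} \mid \mathcal{Y}_{\gamma^{-1}t,n-2}) - E(Y_{\tilde{w}\gamma^{-1}t} \mid \mathcal{Y}_{t,n-1} \ominus \mathcal{Y}_{\gamma^{-1}t,n-2}) \end{aligned}$$

Using (3.3) (applied at $\gamma^{-1}t$, $n - 1$) and (3.11) (applied at the odd integer $n - 1$), we then can compute

$$\begin{aligned} F_{t,n}(w) &= F_{\gamma^{-1}t,n-1}(\tilde{w}) - E(Y_{\tilde{w}\gamma^{-1}t} \mid \mathcal{E}_{t,n-1}) \\ &= F_{\gamma^{-1}t,n-1}(\tilde{w}) - E(F_{\gamma^{-1}t,n-1}(\tilde{w}) \mid \mathcal{E}_{t,n-1}) \end{aligned} \qquad (3.13)$$

where the last equality follows from the orthogonality of $\mathcal{E}_{t,n-1}$ and $\mathcal{Y}_{\gamma^{-1}t,n-2}$ (from (3.11)). Equation (3.13) then provides us with a recursion for $F_{t,n}(w)$ in terms of variables evaluated at words $\tilde{w}$ of shorter length.

Case 2: Suppose that $w \asymp 0$. Then, since $|w| = n$, it is not difficult to see that $w = \tilde{w}\delta^{(n/2)}$ for some $\tilde{w}$ satisfying $|\tilde{w}| < n$, $\tilde{w} \asymp 0$ (for example, for $n = 4$, the only $w$ satisfying $|w| = n$ and $w \asymp 0$ are $\delta^{(2)}$ and $\delta\delta^{(2)}$; see Example 3.1). As in Case 1 we have that

$$F_{t,n}(w) = Y_{\tilde{w}\delta^{(n/2)}t} - E(Y_{\tilde{w}\delta^{(n/2)}t} \mid \mathcal{Y}_{\gamma^{-1}t,n-2}) - E(Y_{\tilde{w}\delta^{(n/2)}t} \mid \mathcal{Y}_{t,n-1} \ominus \mathcal{Y}_{\gamma^{-1}t,n-2}) \qquad (3.14)$$

Now for $n$ even we can show that

$$\mathcal{Y}_{\gamma^{-1}t,n-2} = \mathcal{Y}_{\gamma^{-1}\delta^{(n/2)}t,n-2}$$

For example for $n = 4$ these both equal $\mathcal{H}\{Y_{\gamma^{-1}t}, Y_{\gamma^{-2}t}, Y_{\gamma^{-3}t}, Y_{\delta\gamma^{-1}t}\}$. Using this together with (3.5) and the orthogonality of $\mathcal{E}_{t,n-1}$ and $\mathcal{Y}_{\gamma^{-1}t,n-2}$ we can reduce (3.14) to

$$F_{t,n}(w) = E_{\delta^{(n/2)}t,n-1}(\tilde{w}) - E(E_{\delta^{(n/2)}t,n-1}(\tilde{w}) \mid \mathcal{E}_{t,n-1}) \qquad (3.15)$$

which again expresses each $F_{t,n}(w)$ in terms of prediction errors evaluated at shorter words. As an additional comment, note that the number of words satisfying Case 1 is the same as the number for Case 2 (i.e. one-half $\dim F_{t,n}$).

Consider next $E_{t,n}(w)$ for $|w| < n$ and $w \asymp 0$. In this case we compute

$$\begin{aligned} E_{t,n}(w) &= Y_{wt} - E(Y_{wt} \mid \mathcal{Y}_{\gamma^{-1}t,n-2}) - E(Y_{wt} \mid \mathcal{Y}_{\gamma^{-1}t,n-1} \ominus \mathcal{Y}_{\gamma^{-1}t,n-2}) \\ &= E_{t,n-1}(w) - E(E_{t,n-1}(w) \mid \mathcal{F}_{\gamma^{-1}t,n-1}) \end{aligned} \qquad (3.16)$$

where the last equality follows from (3.2).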


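n odd

Let us first consider the special case of $n = 1$, which will provide the starting point for our recursions. From Example 3.1

$$\begin{aligned} F_{t,1} &= Y_{\gamma^{-1}t} - E(Y_{\gamma^{-1}t} \mid Y_t) \\ &= Y_{\gamma^{-1}t} - k_1 Y_t = F_{\gamma^{-1}t,0} - k_1 E_{t,0} \end{aligned} \qquad (3.17)$$

where $k_1$ is the first reflection coefficient, exactly as for time series

$$k_1 = \frac{E[Y_t Y_{\gamma^{-1}t}]}{E[Y_t^2]} = \frac{r_1}{r_0} \qquad (3.18)$$

Similarly

$$\begin{aligned} E_{t,1} &= Y_t - E(Y_t \mid Y_{\gamma^{-1}t}) \\ &= Y_t - k_1 Y_{\gamma^{-1}t} = E_{t,0} - k_1 F_{\gamma^{-1}t,0} \end{aligned} \qquad (3.19)$$

Consider next the computation of $F_{t,n}(w)$ for $n \ge 3$ and odd. Note that for $n$ odd it is impossible for $|w| = n$ and $w \asymp 0$. Therefore the condition

$$|w| = n \text{ and } w \preceq 0$$

is equivalent to

$$w = \tilde{w}\gamma^{-1}, \quad |\tilde{w}| = n - 1, \quad \tilde{w} \preceq 0$$

Therefore, proceeding as before,

$$\begin{aligned} F_{t,n}(w) &= Y_{\tilde{w}\gamma^{-1}t} - E(Y_{\tilde{w}\gamma^{-1}t} \mid \mathcal{Y}_{\gamma^{-1}t,n-2}) - E(Y_{\tilde{w}\gamma^{-1}t} \mid \mathcal{Y}_{t,n-1} \ominus \mathcal{Y}_{\gamma^{-1}t,n-2}) \\ &= F_{\gamma^{-1}t,n-1}(\tilde{w}) - E\left(F_{\gamma^{-1}t,n-1}(\tilde{w}) \mid \mathcal{E}_{t,n-1}, \mathcal{E}_{\delta^{(\frac{n-1}{2})}t,n-1}\right) \end{aligned} \qquad (3.20)$$

where the last equality follows from (3.12) applied at the even integer $n - 1$.

Consider next the computation of $E_{t,n}(w)$ for $n \ge 3$ and odd, and for $|w| < n$, $w \asymp 0$. There are two cases (each corresponding to one-half the components of $E_{t,n}$) depending upon whether $|w|$ is $n - 1$ or smaller.

Case 1: Suppose that $|w| < n - 1$. In this case exactly the same type of argument yields

$$E_{t,n}(w) = E_{t,n-1}(w) - E(E_{t,n-1}(w) \mid \mathcal{F}_{\gamma^{-1}t,n-1}) \qquad (3.21)$$

Case 2: Suppose that $|w| = n - 1$. In this case $w = \tilde{w}\delta^{(\frac{n-1}{2})}$ where $\tilde{w} \asymp 0$, and computations analogous to those performed previously yield

$$E_{t,n}(w) = E_{\delta^{(\frac{n-1}{2})}t,n-1}(\tilde{w}) - E\left(E_{\delta^{(\frac{n-1}{2})}t,n-1}(\tilde{w}) \mid \mathcal{F}_{\gamma^{-1}t,n-1}\right) \qquad (3.22)$$

where in this case we use the fact that

$$\mathcal{Y}_{\gamma^{-1}t,n-2} = \mathcal{Y}_{\gamma^{-1}\delta^{(\frac{n-1}{2})}t,n-2}$$

For example for $n = 5$ these both equal

$$\mathcal{H}\{Y_{\gamma^{-1}t}, Y_{\gamma^{-2}t}, Y_{\gamma^{-3}t}, Y_{\gamma^{-4}t}, Y_{\delta\gamma^{-1}t}, Y_{\delta\gamma^{-2}t}\}$$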
We have now identified six formulas, (3.13), (3.15), (3.16), (3.20), (3.21), and (3.22), for the order-by-order recursive computation of the forward and backward prediction errors. Of course we must still address the issue of computing the projections defined in these formulas. As we make explicit in the next subsection, the richness of the group of isometries and the constraints of isotropy provide the basis for a significant simplification of these projections by showing that we need only compute projections onto the local averages or barycenters of the prediction errors. Moreover, scalar recursions for these barycenters provide us both with a straightforward method for calculating the sequence of reflection coefficients and with a generalization of the Schur recursions.

Finally, as mentioned previously $E_{t,n}$ is not, in general, an isotropic process unless $Y_t$ is AR($p$) and $n \ge p$, in which case it is white noise. To illustrate this, consider the computations of $E[E_{t,1}E_{\delta t,1}]$ and $E[E_{t,1}E_{\gamma^{-2}t,1}]$, which should be equal if $E_{t,1}$ is isotropic. From (3.18), (3.19) we find that

$$E[E_{t,1}E_{\delta t,1}] = r_2 - \frac{r_1^2}{r_0}$$

while

$$E[E_{t,1}E_{\gamma^{-2}t,1}] = r_2 - \frac{r_1^2}{r_0} + \frac{r_1(r_1 r_2 - r_0 r_3)}{r_0^2}$$

In general these expressions are not equal so that $E_{t,1}$ is not isotropic. However, from the calculations in Appendix A we see that these expressions are equal and indeed $E_{t,1}$ is white noise if $Y_t$ is AR(1). A stronger result that we state without proof is that $E_{t,n}$, suitably normalized, is isotropic for all $n \ge p$ if and only if $Y_t$ is AR($p$).
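The two cross-covariances in this counterexample are easy to reproduce numerically by expanding the products term by term with (3.18), (3.19) and reading each covariance off the tree distance between the two nodes. The sketch below is illustrative only; the function name and the sample values of $r_0, \ldots, r_3$ are assumptions made here, and the AR(1) check assumes the geometric covariance $r_n = r_0 a^n$.

```python
def order_one_cross_covariances(r):
    """Given r = (r0, r1, r2, r3), return
    (E[E_{t,1} E_{dt,1}], E[E_{t,1} E_{g^-2 t,1}])."""
    r0, r1, r2, r3 = r
    k1 = r1 / r0                                   # (3.18)
    # E_{t,1} = Y_t - k1 Y_{g^-1 t},  E_{dt,1} = Y_{dt} - k1 Y_{g^-1 t}
    sibling = r2 - k1 * r1 - k1 * r1 + k1 ** 2 * r0
    # E_{g^-2 t,1} = Y_{g^-2 t} - k1 Y_{g^-3 t}
    grandparent = r2 - k1 * r3 - k1 * r1 + k1 ** 2 * r2
    return sibling, grandparent


# Arbitrary illustrative values: the two covariances differ.
print(order_one_cross_covariances((1.0, 0.5, 0.4, 0.3)))

# Geometric covariance (assumed AR(1) form): both vanish.
a = 0.5
print(order_one_cross_covariances(tuple(a ** n for n in range(4))))
```

Both distances $d(t, \delta t)$ and $d(t, \gamma^{-2}t)$ equal 2, so an isotropic $E_{t,1}$ would require the two returned numbers to coincide.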


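3.3  Projections  onto  $\mathcal{E}$  and  $\mathcal{F}$  and  their  Barycenters

Let us define the average values of the components of the prediction errors:

$$e_{t,n} = 2^{-[\frac{n-1}{2}]} \sum_{|w| < n,\ w \asymp 0} E_{t,n}(w) \qquad (3.23)$$

$$f_{t,n} = 2^{-[\frac{n}{2}]} \sum_{|w| = n,\ w \preceq 0} F_{t,n}(w) \qquad (3.24)$$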
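The following result is critical.

Lemma 3.1 The six collections of projections necessary for the order recursive computation of the prediction errors for all required words $w$ and $\tilde{w}$ can be reduced to a total of four projections onto the barycenters of the prediction error vectors. In particular,

For n even: For any word $w'$ such that $|w'| = n - 1$ and for any word $w''$ such that $|w''| < n$ and $w'' \asymp 0$, we have that

$$\begin{aligned} E(F_{\gamma^{-1}t,n-1}(w') \mid \mathcal{E}_{t,n-1}) &= E(E_{\delta^{(n/2)}t,n-1}(w'') \mid \mathcal{E}_{t,n-1}) && (3.25) \\ &= E(F_{\gamma^{-1}t,n-1}(w_0) \mid e_{t,n-1}) && (3.26) \\ &= E(E_{\delta^{(n/2)}t,n-1}(0) \mid e_{t,n-1}) && (3.27) \end{aligned}$$

(refer to (3.13), (3.15)) where $w_0$ is any of the $w'$. Also for any $w$ such that $|w| < n$ and $w \asymp 0$, we have that

$$E(E_{t,n-1}(w) \mid \mathcal{F}_{\gamma^{-1}t,n-1}) = E(E_{t,n-1}(0) \mid f_{\gamma^{-1}t,n-1}) \qquad (3.28)$$

(refer to (3.16)).

For n odd: For any $w'$ and $w''$ satisfying the constraints $|\cdot| < n$ and $\cdot \asymp 0$ we have that

$$\begin{aligned} E(E_{t,n-1}(w') \mid \mathcal{F}_{\gamma^{-1}t,n-1}) &= E(E_{\delta^{(\frac{n-1}{2})}t,n-1}(w'') \mid \mathcal{F}_{\gamma^{-1}t,n-1}) && (3.29) \\ &= E(E_{t,n-1}(0) \mid f_{\gamma^{-1}t,n-1}) && (3.30) \end{aligned}$$

(refer to (3.21), (3.22)). In addition for any $\tilde{w} \preceq 0$ such that $|\tilde{w}| = n - 1$

$$E\left(F_{\gamma^{-1}t,n-1}(\tilde{w}) \mid \mathcal{E}_{t,n-1}, \mathcal{E}_{\delta^{(\frac{n-1}{2})}t,n-1}\right) = E\left(F_{\gamma^{-1}t,n-1}(w_0) \mid \tfrac{1}{2}\left(e_{t,n-1} + e_{\delta^{(\frac{n-1}{2})}t,n-1}\right)\right) \qquad (3.31)$$

(refer to (3.20)) where $w_0$ is any of the $\tilde{w}$.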

These results rely heavily on the structure of the dyadic tree, the isometry extension lemma, and the isotropy of $Y$. As an illustration consider the cases $n = 4$ and 5 illustrated in Figures 3.5 and 3.6. Consider $n = 4$ first. Note that the distance relationships of each of the elements of $F_{\gamma^{-1}t,3}$ and of $E_{\delta^{(2)}t,3}$ to $E_{t,3}$ are the same. Furthermore all three of these vectors contain errors in estimates based on $\mathcal{Y}_{\gamma^{-1}t,2}$. Hence because of this symmetry and the isotropy of $Y$, the projections of any of the elements of $F_{\gamma^{-1}t,3}$ or $E_{\delta^{(2)}t,3}$ onto $\mathcal{E}_{t,3}$ must be the same, as stated in (3.25). Furthermore, the two elements of $E_{t,3}$ have identical geometric relationship with respect to the elements of the other two error vectors. Hence the projections onto $\mathcal{E}_{t,3}$ must weight its two elements equally, i.e. the projection must depend only on the average of the two, $e_{t,3}$, as stated in (3.26), (3.27). Similarly, the two elements of $F_{\gamma^{-1}t,3}$ have identical geometric relations to each of the elements of $E_{t,3}$, so that (3.28) must hold. Similar geometric arguments apply to Figure 3.6 and (3.29)-(3.31) evaluated at $n = 5$. Perhaps the only one deserving comment is (3.31). Note, however, in this case that each of the elements of $F_{\gamma^{-1}t,4}$ has the same geometric relationship to all of the elements of $E_{t,4}$ and $E_{\delta^{(2)}t,4}$, and therefore the projection onto the combined span of these elements must weight the elements of $E_{t,4}$ and $E_{\delta^{(2)}t,4}$ equally and thus is a function of $\frac{1}{2}(e_{t,4} + e_{\delta^{(2)}t,4})$.

Proof  of  Lemma  3.1:  As  we  have  just  illustrated  the  ideas  behind  each  of  the 
statements  in  the  lemma  are  the  same  and  thus  we  will  focus  explicitly  only  on 
the  demonstration  of  (3.26).  The  other  formulas  are  then  obtained  by  analogous 
arguments. 

The  demonstration  of  (3.26)  depends  on  the  following  three  lemmas  which  are 
proved  in  Appendix  C  by  exploiting  symmetry  and  the  isometry  extension  lemma. 

Lemma 3.2 The expectation

$$G_{t,n} = E(F_{\gamma^{-1}t,n-1}(w) \mid \mathcal{E}_{t,n-1}) \qquad (3.32)$$

for $n$ even is the same for all $|w| = n - 1$, $w \preceq 0$.

Lemma 3.3 The expectation

$$H_{t,n} = E(F_{\gamma^{-1}t,n-1}(w)\, E_{t,n-1}(w')) \qquad (3.33)$$

is the same for all $|w| = n - 1$, $w \preceq 0$ and all $|w'| < n$ and $w' \asymp 0$.

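Lemma 3.4 The covariance $\Sigma_{E,n}$ of $E_{t,n}$ has the following structure. Let $\Sigma(\alpha_0, \ldots, \alpha_d)$ denote a $2^d \times 2^d$ covariance matrix, depending upon scalars $\alpha_0, \ldots, \alpha_d$ and with the following recursively-defined structure:

$$\Sigma(\alpha_0) = \alpha_0 \qquad (3.34)$$

$$\Sigma(\alpha_0, \ldots, \alpha_d) = \begin{pmatrix} \Sigma(\alpha_0, \ldots, \alpha_{d-1}) & \alpha_d U_{d-1} \\ \alpha_d U_{d-1} & \Sigma(\alpha_0, \ldots, \alpha_{d-1}) \end{pmatrix} \qquad (3.35)$$

where $U_d$ is a $2^d \times 2^d$ matrix all of whose values are 1 (i.e. $U_d = 1_d 1_d^T$ where $1_d$ is a $2^d$-dimensional vector of 1's). Then there exist numbers $\alpha_0, \alpha_1, \ldots, \alpha_{[\frac{n-1}{2}]}$ so that

$$\Sigma_{E,n} = \Sigma(\alpha_0, \ldots, \alpha_{[\frac{n-1}{2}]}) \qquad (3.36)$$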

From Lemma 3.2 we see that we need only show that $G_{t,n}$ depends only on $e_{t,n-1}$. However, from Lemma 3.4 it is a simple calculation to verify that $1_{[\frac{n-1}{2}]}$ is an eigenvector of $\Sigma_{E,n}$. Then, consider any $X \in \mathcal{E}_{t,n-1}$ of the form

$$X = \sum_{|w'| < n,\ w' \asymp 0} \lambda_{w'} E_{t,n-1}(w') \qquad (3.37)$$

where

$$\sum_{|w'| < n,\ w' \asymp 0} \lambda_{w'} = 0 \qquad (3.38)$$

Then, since $e_{t,n-1}$ is also as in (3.37) but with all $\lambda_{w'}$ equal, we have that

$$2^{[\frac{n-2}{2}]}\, E(X e_{t,n-1}) = (\lambda_{w'}, \ldots)\, \Sigma_{E,n-1} \begin{pmatrix} 1 \\ \vdots \\ 1 \end{pmatrix} = 0 \qquad (3.39)$$

Thus we have an orthogonal decomposition of $\mathcal{E}_{t,n-1}$ into the space spanned by $X$ as in (3.37), (3.38) and the one-dimensional subspace spanned by $e_{t,n-1}$. However, thanks to Lemma 3.3, for any $X$ satisfying (3.37), (3.38)

$$E[F_{\gamma^{-1}t,n-1}(w) X] = \left(\sum_{w'} \lambda_{w'}\right) H_{t,n} = 0 \qquad (3.40)$$

Thus the projection (3.32) is equal to the projection onto $e_{t,n-1}$, proving our result.
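The argument can be illustrated numerically: when the covariance of the data vector has the structure of Lemma 3.4 and the variable being estimated has the same cross-covariance with every component (Lemma 3.3), the normal equations are solved by equal weights, so the estimate depends on the data only through their average. The sketch below is a hedged illustration in plain Python; the helper names `sigma` and `solve`, the sample values of the $\alpha_i$, and the cross-covariance `h` are all assumptions made here.

```python
def sigma(alphas):
    """Build Sigma(alpha_0, ..., alpha_d) of (3.34)-(3.35) as nested lists."""
    m = [[alphas[0]]]
    for a in alphas[1:]:
        size = len(m)
        top = [row + [a] * size for row in m]       # [Sigma, a*U]
        bottom = [[a] * size + row for row in m]    # [a*U, Sigma]
        m = top + bottom
    return m


def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


alphas = [4.0, 1.0, 0.5]          # illustrative values only
h = 0.7                           # common cross-covariance, illustrative
weights = solve(sigma(alphas), [h] * 4)
row_sum = alphas[0] + alphas[1] + 2 * alphas[2]   # eigenvalue for (1, ..., 1)
print(weights, h / row_sum)
```

All four weights come out equal to `h` divided by the eigenvalue associated with the all-ones vector, which is the content of (3.26).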


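Remark: Lemma 3.4 allows us to say a great deal about the structure of $\Sigma_{E,n}$. In particular it is straightforward to verify that the eigenvectors of $\Sigma_{E,n}$ are the discrete Haar basis. For example in dimension 8 the eigenvectors are the columns of the matrix

$$\begin{pmatrix}
\frac{1}{\sqrt{2}} & 0 & 0 & 0 & \frac{1}{2} & 0 & \frac{1}{2\sqrt{2}} & \frac{1}{2\sqrt{2}} \\
-\frac{1}{\sqrt{2}} & 0 & 0 & 0 & \frac{1}{2} & 0 & \frac{1}{2\sqrt{2}} & \frac{1}{2\sqrt{2}} \\
0 & \frac{1}{\sqrt{2}} & 0 & 0 & -\frac{1}{2} & 0 & \frac{1}{2\sqrt{2}} & \frac{1}{2\sqrt{2}} \\
0 & -\frac{1}{\sqrt{2}} & 0 & 0 & -\frac{1}{2} & 0 & \frac{1}{2\sqrt{2}} & \frac{1}{2\sqrt{2}} \\
0 & 0 & \frac{1}{\sqrt{2}} & 0 & 0 & \frac{1}{2} & -\frac{1}{2\sqrt{2}} & \frac{1}{2\sqrt{2}} \\
0 & 0 & -\frac{1}{\sqrt{2}} & 0 & 0 & \frac{1}{2} & -\frac{1}{2\sqrt{2}} & \frac{1}{2\sqrt{2}} \\
0 & 0 & 0 & \frac{1}{\sqrt{2}} & 0 & -\frac{1}{2} & -\frac{1}{2\sqrt{2}} & \frac{1}{2\sqrt{2}} \\
0 & 0 & 0 & -\frac{1}{\sqrt{2}} & 0 & -\frac{1}{2} & -\frac{1}{2\sqrt{2}} & \frac{1}{2\sqrt{2}}
\end{pmatrix} \qquad (3.41)$$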


Also, as shown in Section 4.2 and in Appendix D, the structure of $\Sigma_{E,n}$ allows us to develop an extremely efficient procedure for calculating $\Sigma_{E,n}$. Indeed this procedure involves a set of scalar computations and a recursive construction similar to the iterative construction of $\Sigma(\alpha_0, \alpha_1, \ldots, \alpha_d)$, with a total complexity of $O(l \log l)$, where $l = 2^{[\frac{n-1}{2}]}$.
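The claim in the Remark can be checked directly for small dimensions. The following sketch builds $\Sigma(\alpha_0, \ldots, \alpha_d)$ from (3.34)-(3.35), generates the (unnormalized) discrete Haar vectors, and tests each one for the eigenvector property. It is an illustration only: the function names and the numerical values of the $\alpha_i$ are assumptions made here, not taken from the paper.

```python
def sigma(alphas):
    """Sigma(alpha_0, ..., alpha_d) of (3.34)-(3.35) as nested lists."""
    m = [[alphas[0]]]
    for a in alphas[1:]:
        size = len(m)
        m = [row + [a] * size for row in m] + [[a] * size + row for row in m]
    return m


def haar_vectors(d):
    """Unnormalized discrete Haar vectors of length 2**d (constant vector first)."""
    n = 2 ** d
    vecs = [[1] * n]
    block = n
    while block >= 2:
        half = block // 2
        for start in range(0, n, block):
            v = [0] * n
            for i in range(start, start + half):
                v[i] = 1
            for i in range(start + half, start + block):
                v[i] = -1
            vecs.append(v)
        block = half
    return vecs


def eigenvalue_or_none(m, v, tol=1e-12):
    """Return lam if m v = lam v (within tol), else None."""
    mv = [sum(a * b for a, b in zip(row, v)) for row in m]
    i = next(k for k, x in enumerate(v) if x != 0)
    lam = mv[i] / v[i]
    return lam if all(abs(y - lam * x) < tol for x, y in zip(v, mv)) else None


S = sigma([4.0, 1.0, 0.5, 0.25])          # 8 x 8, illustrative values
for v in haar_vectors(3):
    print(v, eigenvalue_or_none(S, v))
```

Up to normalization these eight vectors are the columns of (3.41), in a different column order. The eigenvalue printed for the constant vector is the common row sum of $\Sigma$.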


3.4  Scalar  Recursions  for  the  Barycenters 

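An immediate consequence of Lemma 3.1, the definitions of the barycenters, and the computations in Section 3.2 is the following set of recursions for the barycenters themselves:

For n even:

$$e_{t,n} = e_{t,n-1} - E(e_{t,n-1} \mid f_{\gamma^{-1}t,n-1}) \qquad (3.42)$$

$$f_{t,n} = \tfrac{1}{2}\left(f_{\gamma^{-1}t,n-1} + e_{\delta^{(n/2)}t,n-1}\right) - \tfrac{1}{2} E\left(f_{\gamma^{-1}t,n-1} + e_{\delta^{(n/2)}t,n-1} \mid e_{t,n-1}\right) \qquad (3.43)$$

For n odd, n > 1:

$$e_{t,n} = \tfrac{1}{2}\left(e_{t,n-1} + e_{\delta^{(\frac{n-1}{2})}t,n-1}\right) - \tfrac{1}{2} E\left(e_{t,n-1} + e_{\delta^{(\frac{n-1}{2})}t,n-1} \mid f_{\gamma^{-1}t,n-1}\right) \qquad (3.44)$$

$$f_{t,n} = f_{\gamma^{-1}t,n-1} - E\left(f_{\gamma^{-1}t,n-1} \mid \tfrac{1}{2}\left(e_{t,n-1} + e_{\delta^{(\frac{n-1}{2})}t,n-1}\right)\right) \qquad (3.45)$$

while for $n = 1$,

$$f_{t,1} = F_{t,1}, \qquad e_{t,1} = E_{t,1} \qquad (3.46)$$

and thus (3.17)-(3.19) provide the necessary formulas.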

It remains now to compute explicitly the projections indicated in (3.42)-(3.45). As the following result states, we only need compute one number, $k_n$, at each stage of the recursion, where $k_n$ is the correlation coefficient between a variable being estimated and the variable on which the estimate is based. We have already seen this for $n = 1$ in (3.17)-(3.19), which yields also the first of the sequence $k_n$ which we refer to as the reflection coefficient sequence.

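Lemma 3.5 For n even:

$$e_{t,n} = e_{t,n-1} - k_n f_{\gamma^{-1}t,n-1} \qquad (3.47)$$

$$f_{t,n} = \tfrac{1}{2}\left(f_{\gamma^{-1}t,n-1} + e_{\delta^{(n/2)}t,n-1}\right) - k_n e_{t,n-1} \qquad (3.48)$$

where

$$\begin{aligned} k_n &= \mathrm{cor}(e_{t,n-1}, f_{\gamma^{-1}t,n-1}) \\ &= \mathrm{cor}(e_{\delta^{(n/2)}t,n-1}, e_{t,n-1}) \\ &= \mathrm{cor}(e_{\delta^{(n/2)}t,n-1}, f_{\gamma^{-1}t,n-1}) \end{aligned} \qquad (3.49)$$

and $\mathrm{cor}(x, y) = E(xy)/[E(x^2)E(y^2)]^{1/2}$.

For n odd:

$$e_{t,n} = \tfrac{1}{2}\left(e_{t,n-1} + e_{\delta^{(\frac{n-1}{2})}t,n-1}\right) - k_n f_{\gamma^{-1}t,n-1} \qquad (3.50)$$

$$f_{t,n} = f_{\gamma^{-1}t,n-1} - \tfrac{1}{2} k_n \left(e_{t,n-1} + e_{\delta^{(\frac{n-1}{2})}t,n-1}\right) \qquad (3.51)$$

where

$$k_n = \mathrm{cor}\left(\tfrac{1}{2}\left(e_{t,n-1} + e_{\delta^{(\frac{n-1}{2})}t,n-1}\right), f_{\gamma^{-1}t,n-1}\right) \qquad (3.52)$$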

Keys to proving this result are the following two lemmas, the first of which is proven in Appendix C and the second of which can be proven in an analogous manner:

Lemma 3.6 For n odd:

$$E\left(e_{t,n}^2\right) = E\left(e_{\delta^{(\frac{n+1}{2})}t,n}^2\right) = E\left(f_{\gamma^{-1}t,n}^2\right) = \sigma_n^2 \qquad (3.53)$$

Lemma 3.7 For n even, $\frac{1}{2}\left(e_{t,n} + e_{\delta^{(n/2)}t,n}\right)$ and $f_{\gamma^{-1}t,n}$ have the same variance.


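Proof of Lemma 3.5 We begin with the case of $n$ even. Since $n - 1$ is odd, Lemma 3.6 yields

$$E\left(e_{t,n-1}^2\right) = E\left(e_{\delta^{(n/2)}t,n-1}^2\right) = E\left(f_{\gamma^{-1}t,n-1}^2\right) = \sigma_{n-1}^2 \qquad (3.54)$$

From (3.42)-(3.43) we then see that (3.47)-(3.49) are correct if

$$\begin{aligned} E[e_{t,n-1} f_{\gamma^{-1}t,n-1}] &= E[e_{\delta^{(n/2)}t,n-1} e_{t,n-1}] \\ &= E[e_{\delta^{(n/2)}t,n-1} f_{\gamma^{-1}t,n-1}] = g_{n-1} \end{aligned} \qquad (3.55)$$

so that

$$k_n = \frac{g_{n-1}}{\sigma_{n-1}^2} \qquad (3.56)$$

However, the first equality in (3.55) follows directly from Lemma 3.1, while the second equality results from the first with $t$ replaced by $\delta^{(n/2)}t$ and the fact that

$$f_{\gamma^{-1}t,n-1} = f_{\gamma^{-1}\delta^{(n/2)}t,n-1} \qquad (3.57)$$

For $n$ odd the result directly follows from Lemma 3.7 and (3.44), (3.45).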
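Corollary: The variances of the barycenters satisfy the following recursions. For n even

$$\sigma_{e,n}^2 = E\left(e_{t,n}^2\right) = \left(1 - k_n^2\right)\sigma_{n-1}^2 \qquad (3.58)$$

$$\sigma_{f,n}^2 = E\left(f_{t,n}^2\right) = \left(\frac{1 + k_n}{2} - k_n^2\right)\sigma_{n-1}^2 \qquad (3.59)$$

where $k_n$ must satisfy

$$-\tfrac{1}{2} \le k_n \le 1 \qquad (3.60)$$

For n odd

$$\sigma_n^2 = \left(1 - k_n^2\right)\sigma_{f,n-1}^2 \qquad (3.61)$$

where

$$-1 \le k_n \le 1 \qquad (3.62)$$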

Proof: Equation (3.58) follows directly from (3.47) and (3.49) and the standard formulas for the estimation variance. Equation (3.59) follows in a similar way from (3.48) and (3.49), where the only slightly more complex feature is the use of (3.49) to evaluate the mean-squared value of the term in parentheses in (3.48). Equation (3.61) follows in a similar way from (3.50)-(3.52) and Lemma 3.7. The constraints (3.60) and (3.62) are immediate consequences of the nonnegativity of the various variances.
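The step for (3.59) and (3.60) can be written out as follows. This is a sketch under the hypotheses of Lemma 3.5 for $n$ even, using the equal variances of (3.54) and the equal correlations of (3.49); the shorthand $a$, $b$, $c$ is introduced here only for the illustration.

```latex
% Shorthand (not in the original):
%   a = f_{\gamma^{-1}t,n-1},  b = e_{\delta^{(n/2)}t,n-1},  c = e_{t,n-1}
% By (3.54): E(a^2) = E(b^2) = E(c^2) = \sigma_{n-1}^2
% By (3.49): E(ab) = E(ac) = E(bc) = k_n \sigma_{n-1}^2
\begin{align*}
E\Big[\big(\tfrac12(a+b)\big)^2\Big]
  &= \tfrac14\,(2 + 2k_n)\,\sigma_{n-1}^2 = \tfrac{1+k_n}{2}\,\sigma_{n-1}^2, \\
E\big(f_{t,n}^2\big)
  &= E\Big[\big(\tfrac12(a+b) - k_n c\big)^2\Big]
   = \Big(\tfrac{1+k_n}{2} - 2k_n^2 + k_n^2\Big)\sigma_{n-1}^2
   = \Big(\tfrac{1+k_n}{2} - k_n^2\Big)\sigma_{n-1}^2, \\
\tfrac{1+k_n}{2} - k_n^2 \ge 0
  &\iff (2k_n + 1)(1 - k_n) \ge 0
   \iff -\tfrac12 \le k_n \le 1 .
\end{align*}
```

The lower bound $-\frac{1}{2}$ thus comes entirely from requiring the variance in (3.59) to be nonnegative.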

As we had indicated previously, the constraint of isotropy represents a significantly more severe constraint on the covariance sequence $r_n$. It is interesting to note that these additional constraints manifest themselves in the simple modification (3.60) of the constraint on $k_n$ for $n$ even over the form (3.62) that one also finds in the corresponding theory for time series. Also, as in the case of time series, the satisfaction of (3.60) or (3.62) with equality corresponds to the class of deterministic or singular processes for which perfect prediction is possible. We will have more to say about these and related observations in Section 5.
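The variance recursions of the Corollary, together with the admissibility bounds (3.60) and (3.62), can be sketched as a short routine. This is an illustration only: the function name, the return format, and the convention that the order-zero barycenters both equal $Y_t$ (so that their variance is $r_0$) are assumptions made here.

```python
def barycenter_variances(r0, ks):
    """Propagate (3.58), (3.59), (3.61) for reflection coefficients
    ks = [k_1, k_2, ...]; returns a list of (n, var_e, var_f)."""
    var_e = var_f = r0              # order 0: e_{t,0} = f_{t,0} = Y_t (assumed)
    out = []
    for n, k in enumerate(ks, start=1):
        if n % 2 == 0:
            if not -0.5 <= k <= 1.0:                # (3.60)
                raise ValueError(f"k_{n} = {k} violates -1/2 <= k <= 1")
            prev = var_e            # n - 1 is odd, so var_e == var_f there
            var_e = (1.0 - k * k) * prev            # (3.58)
            var_f = ((1.0 + k) / 2.0 - k * k) * prev  # (3.59)
        else:
            if not -1.0 <= k <= 1.0:                # (3.62)
                raise ValueError(f"k_{n} = {k} violates -1 <= k <= 1")
            var_e = var_f = (1.0 - k * k) * var_f   # (3.61)
        out.append((n, var_e, var_f))
    return out


for row in barycenter_variances(1.0, [0.5, 0.2, 0.0]):
    print(row)
```

A value such as $k_2 = -0.6$ would be acceptable for a time series but is rejected here, which is the simple modification of the bounds noted above.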

3.5  Schur  Recursions  and  Computation  of  the  Reflection  Coefficients

We  now  need  to  address  the  question  of  the  explicit  computation  of  the  reflection 
coefficients.  The  key  to  this  result  is  the  following 

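Lemma 3.8 For n even:

$$E[e_{t,n-1} f_{\gamma^{-1}t,n-1}] = E[Y_t f_{\gamma^{-1}t,n-1}] = E[Y_t e_{\delta^{(n/2)}t,n-1}] \qquad (3.63)$$

$$\sigma_{n-1}^2 = E[e_{t,n-1}^2] = E[Y_t e_{t,n-1}] \qquad (3.64)$$

For n odd:

$$E\left[\tfrac{1}{2}\left(e_{t,n-1} + e_{\delta^{(\frac{n-1}{2})}t,n-1}\right) f_{\gamma^{-1}t,n-1}\right] = E[Y_t f_{\gamma^{-1}t,n-1}] \qquad (3.65)$$

$$E\left[f_{\gamma^{-1}t,n-1}^2\right] = E\left[\left(\tfrac{1}{2}\left(e_{t,n-1} + e_{\delta^{(\frac{n-1}{2})}t,n-1}\right)\right)^2\right] = \tfrac{1}{2}\left[E(Y_t e_{t,n-1}) + E\left(Y_t e_{\delta^{(\frac{n-1}{2})}t,n-1}\right)\right] \qquad (3.66)$$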

Proof: This result is essentially a consequence of other results we have derived previously. For example, for $n$ even, since $f_{\gamma^{-1}t,n-1}$ is orthogonal to $\mathcal{Y}_{\gamma^{-1}t,n-2}$, we have that for $|w| < n$, $w \asymp 0$

$$E[Y_t f_{\gamma^{-1}t,n-1}] = E[E_{t,n-1}(0) f_{\gamma^{-1}t,n-1}] = E[E_{t,n-1}(w) f_{\gamma^{-1}t,n-1}] \qquad (3.67)$$

where the second equality follows from Lemma 3.2. Summing over $|w| < n$ and $w \asymp 0$ and using (3.23) then yields the first equality in (3.63). The second follows in a similar fashion (see also (3.25)). Similarly, since $e_{t,n-1}$ is also orthogonal to $\mathcal{Y}_{\gamma^{-1}t,n-2}$, we have that

$$E[Y_t e_{t,n-1}] = E[E_{t,n-1}(0) e_{t,n-1}] = E[E_{t,n-1}(w) e_{t,n-1}] \qquad (3.68)$$

The last equality here follows from the structure of $\Sigma_{E,n-1}$. In particular,

$$2^{[\frac{n-2}{2}]}\, E[E_{t,n-1}(w) e_{t,n-1}] = [0, \ldots, 1, 0, \ldots, 0]\, \Sigma_{E,n-1} \begin{pmatrix} 1 \\ \vdots \\ 1 \end{pmatrix} = \text{eigenvalue associated with } [1, \ldots, 1]^T \qquad (3.69)$$

(here the single 1 in the row vector in (3.69) is in the $w$th position). Summing over $w$ and using (3.23) yields (3.64). Equations (3.65) and (3.66) follow in an analogous manner.
It is now possible to write the desired recursions for $k_n$. Specifically, if we multiply
(3.47), (3.48), (3.50), (3.51) by $Y_t$ and take expectations we obtain recursions for the
quantities needed in the right-hand sides of (3.63)-(3.66). Furthermore, from (3.49)
and (3.52) we see that $k_n$ is directly computable from the left-hand sides of (3.63)-(3.66).
In order to put the recursions in the most compact and revealing form it is
useful to use formal power series. Specifically, for $n \ge 0$ define $P_n$ and $Q_n$ as:

$$P_n = \mathrm{cov}\left(Y_t, e_{t,n}\right) \triangleq \sum_{w} E\left(Y_t\, e_{wt,n}\right) \cdot w \tag{3.70}$$

$$Q_n = \mathrm{cov}\left(Y_t, f_{t,n}\right) \triangleq \sum_{w} E\left(Y_t\, f_{wt,n}\right) \cdot w \tag{3.71}$$

where we begin with $P_0$ and $Q_0$ specified in terms of the correlation function $r_n$ of $Y_t$:

$$P_0 = Q_0 = \sum_{w} r_{|w|} \cdot w \tag{3.72}$$

Recalling the definitions (2.24), (2.25) of $\gamma S$ and $\delta^{(k)} S$ for $S$ a formal power series
and letting $S(0)$ denote the coefficient of $w = 0$, we have the following generalization
of the Schur recursions:

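Proposition: The following formal power series recursions yield the sequence of
reflection coefficients.

For n even

$$P_n = P_{n-1} - k_n\, \gamma Q_{n-1} \tag{3.73}$$

$$Q_n = \tfrac{1}{2}\left(\gamma Q_{n-1} + \delta^{(n/2)} P_{n-1}\right) - k_n P_{n-1} \tag{3.74}$$

where

$$k_n = \frac{\gamma Q_{n-1}(0) + \delta^{(n/2)} P_{n-1}(0)}{2 P_{n-1}(0)} \tag{3.75}$$

For n odd

$$P_n = \tfrac{1}{2}\left(P_{n-1} + \delta^{(\frac{n-1}{2})} P_{n-1}\right) - k_n\, \gamma Q_{n-1} \tag{3.76}$$

$$Q_n = \gamma Q_{n-1} - k_n\, \tfrac{1}{2}\left(P_{n-1} + \delta^{(\frac{n-1}{2})} P_{n-1}\right) \tag{3.77}$$

where

$$k_n = \frac{2\, \gamma Q_{n-1}(0)}{P_{n-1}(0) + \delta^{(\frac{n-1}{2})} P_{n-1}(0)} \tag{3.78}$$

Note that for n = 1, (3.76)-(3.77) do agree with (3.17)-(3.19) since $P_0 = \delta^{(0)} P_0$,
$\gamma Q_0(0) = r_1$ and $P_0(0) = r_0$.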


4  Vector  Levinson  Recursions  and  Modeling  and 
Whitening  Filters 

In this section we return to the vector prediction errors $E_{t,n}$, $F_{t,n}$ in order to develop
whitening and modeling filters for $Y_t$. As we will see, in order to produce true whitening
filters, it will be necessary to perform a further normalization of the innovations.
However, the formulas for $E_{t,n}$ and $F_{t,n}$ are simpler and are sufficient for us to study
the question of stability. Consequently we begin with them.


4.1  Filters  Involving  the  Unnormalized  Residuals 


To begin, let us introduce a variation on notation used to describe the structure of
$\Sigma_{E,n}$. In particular we let $1_*$ denote a unit vector all of whose components are the
same:

$$1_* = \frac{1}{\sqrt{\dim 1}}\, 1 \tag{4.1}$$

We also define the matrix

$$U_* = 1_* 1_*^T \tag{4.2}$$

which has a single nonzero eigenvalue of 1. Equations (4.1), (4.2) define a family
of vectors and matrices of different dimensions. The dimension used in any of the
expressions to follow is that required for the expression to make sense. We also note
the following identities:

$$U_* U_* = U_* \tag{4.3}$$

$$f_* = 1_*^T F \tag{4.4}$$

$$f\, 1 = f_*\, 1_* = U_* F \tag{4.5}$$

where $F = \{F(w)\}$ is a vector indexed by certain words $w$ ordered as we have
described previously, where $f$ is its barycenter, and where $f_*$ is a normalized version
of its barycenter.
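
The identities above are easy to check numerically. The following is a minimal pure-Python sketch, assuming that (4.4)-(4.5) read $f_* = 1_*^T F$ and $f\,1 = f_* 1_* = U_* F$; the helper names (`unit_ones`, `u_star`, `matvec`, and so on) are illustrative and do not come from the paper.

```python
import math

def unit_ones(dim):
    """1_*: unit-norm vector whose components are all equal, as in (4.1)."""
    return [1.0 / math.sqrt(dim)] * dim

def outer(u, v):
    """Outer product u v^T as a list of rows."""
    return [[ui * vj for vj in v] for ui in u]

def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def matvec(a, x):
    """Matrix-vector product."""
    return [sum(aij * xj for aij, xj in zip(row, x)) for row in a]

def u_star(dim):
    """U_* = 1_* 1_*^T, as in (4.2)."""
    one = unit_ones(dim)
    return outer(one, one)

def barycenter(vec):
    """Barycenter (plain average) of the components of a vector."""
    return sum(vec) / len(vec)

if __name__ == "__main__":
    F = [3.0, -1.0, 4.0, 2.0]      # arbitrary test vector (hypothetical data)
    U = u_star(len(F))
    print(matvec(U, F))            # each entry equals the barycenter of F
    print(barycenter(F))
```

Applying `u_star` replaces every component of a vector by its barycenter, which is why a single scalar reflection coefficient multiplying $U_*$ is enough to describe each stage of the filters that follow.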

The results of the preceding section lead directly to the following recursions for
the prediction error vectors:

Theorem 4.1  The prediction error vectors $E_{t,n}$ and $F_{t,n}$ satisfy the following recursions,
where the $k_n$ are the reflection coefficients for the process $Y_t$:

For n even:

$$E_{t,n} = E_{t,n-1} - k_n U_* F_{\gamma^{-1}t,n-1} \tag{4.6}$$

$$F_{t,n} = \begin{pmatrix} E_{\delta^{(n/2)}t,n-1} \\ F_{\gamma^{-1}t,n-1} \end{pmatrix} - k_n \begin{pmatrix} U_* \\ U_* \end{pmatrix} E_{t,n-1} \tag{4.7}$$

For n odd, n > 1:

$$E_{t,n} = \begin{pmatrix} E_{t,n-1} \\ E_{\delta^{(\frac{n-1}{2})}t,n-1} \end{pmatrix} - k_n U_* F_{\gamma^{-1}t,n-1} \tag{4.8}$$

$$F_{t,n} = F_{\gamma^{-1}t,n-1} - k_n U_* \begin{pmatrix} E_{t,n-1} \\ E_{\delta^{(\frac{n-1}{2})}t,n-1} \end{pmatrix} \tag{4.9}$$

while for n = 1 we have the expressions (3.17)-(3.19).


Proof: As indicated previously, this result is a direct consequence of the analysis
in Section 3. For example, from (3.16), Lemma 3.1 (3.28), and (4.5) we have the
following chain of equalities for n even:

$$\begin{aligned} E_{t,n} &= E_{t,n-1} - E\left(E_{t,n-1} \mid F_{\gamma^{-1}t,n-1}\right) \\ &= E_{t,n-1} - E\left(E_{t,n-1} \mid f_{\gamma^{-1}t,n-1}\right) \\ &= E_{t,n-1} - \lambda\, U_* F_{\gamma^{-1}t,n-1} \end{aligned} \tag{4.10}$$

where $\lambda$ is a constant to be determined. If we premultiply this equality by
$(\dim E_{t,n-1})^{-1} 1^T$, we obtain the corresponding formula for the barycenters, and
from (3.47) we see that $\lambda = k_n$. The other formulae are obtained in an analogous fashion.

The form of these whitening filters deserves some comment. Note first that the
stages of the filter are of growing dimension, reflecting the growing dimension of
$E_{t,n}$ and $F_{t,n}$ as n increases. Nevertheless each stage is characterized by a single
reflection coefficient. Thus, while the dimension of the innovations vector of order n
is on the order of $2^{n/2}$, only n coefficients are needed to specify the whitening filter for
its generation. This, of course, is a direct consequence of the constraint of isotropy
and the richness of the group of isometries of the tree.

In Section 3.4 we obtained recursions (3.58), (3.59), (3.61) for the variances of the
barycenters of the prediction vectors. Theorem 4.1 provides us with the recursions for
the covariances and correlations for the entire prediction error vectors. We summarize
these and other facts about these covariances in the following.

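Corollary: Let $\Sigma_{E,n}$, $\Sigma_{F,n}$ denote the covariances of $E_{t,n}$ and $F_{t,n}$, respectively. Then

1. For n even

   (a) The eigenvalue of $\Sigma_{E,n}$ associated with the eigenvector $[1,\ldots,1]$ is

   $$\mu_{E,n} = 2^{\frac{n}{2}-1}\, \sigma^2_{e,n} \tag{4.11}$$

   where $\sigma^2_{e,n}$ is the variance of $e_{t,n}$.

   (b) The eigenvalue of $\Sigma_{F,n}$ associated with the eigenvector $[1,\ldots,1]$ is

   $$\mu_{F,n} = 2^{\frac{n}{2}}\, \sigma^2_{f,n} \tag{4.12}$$

   where $\sigma^2_{f,n}$ is the variance of $f_{t,n}$.

2. For n odd,

   $$\Sigma_{E,n} = \Sigma_{F,n} = \Sigma_n \tag{4.13}$$

   and the eigenvalue associated with the eigenvector $[1,\ldots,1]$ is

   $$\mu_n = \mu_{E,n} = \mu_{F,n} = 2^{\frac{n-1}{2}}\, \sigma^2_n \tag{4.14}$$

   where $\sigma^2_n$ is the variance of both $e_{t,n}$ and $f_{t,n}$.

3. For n even

   $$\Sigma_n = \Sigma_{F,n} = \mathrm{cov}\begin{pmatrix} E_{t,n} \\ E_{\delta^{(n/2)}t,n} \end{pmatrix} = \begin{pmatrix} \Sigma_{E,n} & \lambda_n U \\ \lambda_n U & \Sigma_{E,n} \end{pmatrix} \tag{4.15}$$

   where $U = 1 1^T$, and

   $$\Sigma_{E,n} = \Sigma_{n-1} - k_n^2\, \sigma^2_{n-1}\, U \tag{4.16}$$

   $$\lambda_n = \left(k_n - k_n^2\right) \sigma^2_{n-1} \tag{4.17}$$

4. For n odd, n > 1

   $$\Sigma_n = \begin{pmatrix} \Sigma_{E,n-1} & \lambda_{n-1} U \\ \lambda_{n-1} U & \Sigma_{E,n-1} \end{pmatrix} - k_n^2\, \sigma^2_{f,n-1}\, U \tag{4.18}$$

5. For n = 1

   $$\Sigma_1 = \left(1 - k_1^2\right) r_0 \tag{4.19}$$
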
Proof: Equations (4.11), (4.12), and (4.14) follow directly from the definition of the
barycenter. For example, for n even

$$2^{\frac{n}{2}-1}\, e_{t,n} = 1^T E_{t,n} \tag{4.20}$$

from which (4.11) follows immediately. Equation (4.13) is a consequence of Lemma
3.1. To verify (4.15) let us first evaluate (4.6) at both $t$ and $\delta^{(n/2)}t$:

$$\begin{pmatrix} E_{t,n} \\ E_{\delta^{(n/2)}t,n} \end{pmatrix} = \begin{pmatrix} E_{t,n-1} \\ E_{\delta^{(n/2)}t,n-1} \end{pmatrix} - k_n \begin{pmatrix} U_* \\ U_* \end{pmatrix} F_{\gamma^{-1}t,n-1} \tag{4.21}$$

The first equality in (4.15) is then a direct consequence of Lemma 3.1 (compare (4.7)
and (4.21)). The form given in the right-most expression in (4.15) is also immediate:
the equality of the diagonal blocks is due to isotropy, while the form of the off-diagonal
blocks again follows from Lemma 3.1. The specific expression for $\Sigma_{E,n}$ in (4.16) follows
directly from the second equality in (4.10), while (4.17) follows from (4.21) and the
fact that

$$E\left[e_{t,n-1}\, e_{\delta^{(n/2)}t,n-1}\right] = k_n\, \sigma^2_{n-1} \tag{4.22}$$

which in turn follows from Lemma 3.1 and (3.49). Finally, (4.18) follows from (4.15)
and (4.8), and (4.19) is immediate from (3.17)-(3.19).
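Just as with time series, the whitening filter specification leads directly to a modeling
filter for $Y_t$.

Corollary: The modeling filter for $Y_t$ is given by the following. For n even

$$\begin{pmatrix} E_{t,n-1} \\ F_{t,n} \end{pmatrix} = \Sigma(k_n) \begin{pmatrix} E_{t,n} \\ E_{\delta^{(n/2)}t,n} \\ F_{\gamma^{-1}t,n-1} \end{pmatrix} \tag{4.23}$$

where

$$\Sigma(k_n) = \begin{pmatrix} I & 0 & k_n U_* \\ -k_n U_* & I & \left(k_n - k_n^2\right) U_* \\ -k_n U_* & 0 & I - k_n^2 U_* \end{pmatrix} \tag{4.24}$$

For n odd, n > 1:

$$\begin{pmatrix} E_{t,n-1} \\ E_{\delta^{(\frac{n-1}{2})}t,n-1} \\ F_{t,n} \end{pmatrix} = \Sigma(k_n) \begin{pmatrix} E_{t,n} \\ F_{\gamma^{-1}t,n-1} \end{pmatrix} \tag{4.25}$$

where

$$\Sigma(k_n) = \begin{pmatrix} I & k_n U_* \\ -k_n U_* & I - k_n^2 U_* \end{pmatrix} \tag{4.26}$$

while for n = 1

$$\begin{pmatrix} E_{t,0} \\ F_{t,1} \end{pmatrix} = \begin{pmatrix} 1 & k_1 \\ -k_1 & 1 - k_1^2 \end{pmatrix} \begin{pmatrix} E_{t,1} \\ F_{\gamma^{-1}t,0} \end{pmatrix} \tag{4.27}$$

These equations can be verified by solving (4.6)-(4.9) and (3.17)-(3.19) to obtain
expressions for E's of order n-1 and F's of order n in terms of E's of order n and
F's of order n-1. Thus, as in the case of lattice filters for time series we have a
scattering layer-like structure for the generation of $Y_t = E_{t,0}$.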

Since $E_{t,N}$ consists of prediction errors for $Y_{wt}$, $|w| < N$, $w \asymp 0$, the input-output
map for the modeling filter is a map from the $2^{[\frac{N-1}{2}]}$-dimensional input $E_{t,N}$
to the $2^{[\frac{N-1}{2}]}$-dimensional vector of outputs $\{Y_{wt} : |w| < N,\ w \asymp 0\}$. Figure 4.1 provides
a picture of the structure of this multivariable scattering system. Here $\Sigma(k_1)$
is the matrix on the right-hand side of (4.27), and for n odd $\Sigma(k_n)$ is given by
(4.26). For n even $\hat{\Sigma}(k_n)$ is a modified version of $\Sigma(k_n)$ as it produces both $E_{t,n-1}$
and $E_{\delta^{(n/2)}t,n-1}$ (essentially (4.21) with these two prediction vectors viewed as outputs
and $E_{t,n}$, $E_{\delta^{(n/2)}t,n}$ viewed as inputs). Also $\hat{\Sigma}(k_n)$ has both $F_{\gamma^{-1}t,n-1}$ and
$F_{\gamma^{-1}\delta^{(n/2)}t,n-1}$ as inputs. Note that the inputs to this block are not all linearly
independent and thus there are a number of ways to write $\hat{\Sigma}(k_n)$ (essentially due to the
fact that $F_{\gamma^{-1}t,n-1} = F_{\gamma^{-1}\delta^{(n/2)}t,n-1}$). Ordering the inputs as
$(E_{t,n},\ E_{\delta^{(n/2)}t,n},\ F_{\gamma^{-1}t,n-1},\ F_{\gamma^{-1}\delta^{(n/2)}t,n-1})$ and the
outputs as $(E_{t,n-1},\ E_{\delta^{(n/2)}t,n-1},\ F_{t,n})$ one choice is

$$\hat{\Sigma}(k_n) = \begin{pmatrix} I & 0 & k_n U_* & 0 \\ 0 & I & 0 & k_n U_* \\ -k_n U_* & I & \left(k_n - k_n^2\right) U_* & 0 \\ -k_n U_* & 0 & I - k_n^2 U_* & 0 \end{pmatrix}$$

where all the blocks are of dimension $2^{\frac{n}{2}-1}$ (note that this form emphasizes the
redundancy in the input: given $E_{t,n}$, $E_{\delta^{(n/2)}t,n}$, $F_{\gamma^{-1}t,n-1}$ all we need is
$f_{\gamma^{-1}\delta^{(n/2)}t,n-1}$).
Finally,  based  on  this  corollary  we  can  now  state  the  following  stability  result: 

Theorem 4.2  The conditions

$$-1 < k_n < 1, \quad n \text{ odd}, \quad 1 \le n \le N \tag{4.28}$$

$$-\tfrac{1}{2} < k_n < 1, \quad n \text{ even}, \quad 1 \le n \le N \tag{4.29}$$

are necessary and sufficient for the Nth-order modeling filter specified by reflection
coefficients $\{k_n \mid 1 \le n \le N\}$ to be stable, so that a bounded input $E_{t,N}$ yields a bounded
output $Y_t = E_{t,0}$.

Proof: This is a variation on the proof of stability for systems described by cascades
of scattering sections, complicated here by the growing dimensionality of the E's and
F's. Let us consider in detail the scattering diagram illustrated in Figure 4.1. Thanks
to the fact that the forward transmission matrices are identity matrices and the common
eigenstructure of all matrices involved, a necessary and sufficient condition for
stability is that all reflection coefficient matrices have eigenvalues of magnitude less
than 1 and all reverse transmission matrices have eigenvalues with magnitudes less
than or equal to 1. For n odd, the transmission matrices $I$ and $I - k_n^2 U_*$ have eigenvalues
of 1 and $1 - k_n^2$, while the reflection matrices $\pm k_n U_*$ have nonzero eigenvalues
of $\pm k_n$. From these we can deduce (4.28). For n even, the eigenvalues of

$$\begin{pmatrix} k_n U_* & 0 \\ 0 & k_n U_* \end{pmatrix}$$

are $k_n$ and 0, yielding the constraint $|k_n| < 1$. However the tighter constraint comes
from the other reflection coefficient matrix:

$$\begin{pmatrix} -k_n U_* & I \\ -k_n U_* & 0 \end{pmatrix}$$

The two nonzero eigenvalues of this matrix are the roots of the equation

$$\lambda^2 + k_n \lambda + k_n = 0$$

It is easily checked that these roots have magnitude less than one if and only if (4.29)
is satisfied. Similarly the transmission matrix

$$\begin{pmatrix} \left(k_n - k_n^2\right) U_* & 0 \\ I - k_n^2 U_* & 0 \end{pmatrix}$$

has nonzero eigenvalue $k_n - k_n^2$, which is between $\pm 1$ for $k_n$ satisfying (4.29),
completing the proof.
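
The claim that the roots of $\lambda^2 + k_n\lambda + k_n = 0$ lie strictly inside the unit circle exactly when $-1/2 < k_n < 1$ can be spot-checked numerically. The sketch below is only an illustration of that one step of the proof; the function names are hypothetical, and the characteristic equation is assumed to read as written in this sentence.

```python
import cmath

def nonzero_eigs_even_stage(k):
    """Roots of lambda^2 + k*lambda + k = 0 (quadratic formula, complex-safe)."""
    disc = cmath.sqrt(k * k - 4.0 * k)
    return ((-k + disc) / 2.0, (-k - disc) / 2.0)

def even_stage_stable(k):
    """True when both roots have magnitude strictly less than one."""
    return all(abs(root) < 1.0 for root in nonzero_eigs_even_stage(k))

def satisfies_4_29(k):
    """The even-order bound (4.29): -1/2 < k < 1."""
    return -0.5 < k < 1.0

if __name__ == "__main__":
    # Values away from the boundary points -1/2 and 1, where rounding could matter.
    for k in (-0.8, -0.51, -0.49, 0.0, 0.5, 0.99, 1.01, 2.0):
        print(k, even_stage_stable(k), satisfies_4_29(k))
```

The same conclusion follows from the standard second-order (Jury) test: for $\lambda^2 + a_1\lambda + a_0$ the roots are inside the unit circle if and only if $|a_0| < 1$ and $1 \pm a_1 + a_0 > 0$, which with $a_1 = a_0 = k_n$ reduces to $-1/2 < k_n < 1$.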

4.2  Levinson  Recursions  for  the  Normalized  Residuals 

The prediction errors $E_{t,n}$ and $F_{t,n}$ do not quite define isotropic processes. In particular
the components of these vectors representing prediction error vectors at a set of
nodes are correlated. Furthermore for n even we have seen that $E_{t,n}$ and $E_{\delta^{(n/2)}t,n}$ are
correlated (see (4.15)). These observations provide the motivation for the normalized
recursions developed in this section. In this development we use the superscript * to
denote normalized versions of random vectors. Specifically $X^* = \Sigma_X^{-1/2} X$, where
$\Sigma_X$ is the covariance of $X$ and $\Sigma_X^{1/2}$ is its symmetric, positive definite square root.

We  now  can  state  and  prove  the  following: 
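Theorem 4.3  The following are the recursions for the normalized residuals.

For n even

$$\begin{pmatrix} E_{t,n} \\ E_{\delta^{(n/2)}t,n} \end{pmatrix}^{*} = \Theta(k_n) \left[ \begin{pmatrix} E^*_{t,n-1} \\ E^*_{\delta^{(n/2)}t,n-1} \end{pmatrix} - k_n \begin{pmatrix} U_* \\ U_* \end{pmatrix} F^*_{\gamma^{-1}t,n-1} \right] \tag{4.30}$$

$$F^*_{t,n} = \Theta(k_n) \left[ \begin{pmatrix} E^*_{\delta^{(n/2)}t,n-1} \\ F^*_{\gamma^{-1}t,n-1} \end{pmatrix} - k_n \begin{pmatrix} U_* \\ U_* \end{pmatrix} E^*_{t,n-1} \right] \tag{4.31}$$

where $\Theta^{-1}(k_n)$ is the matrix square root satisfying

$$\Theta^{-T}(k_n)\, \Theta^{-1}(k_n) = \begin{pmatrix} I - k_n^2 U_* & \left(k_n - k_n^2\right) U_* \\ \left(k_n - k_n^2\right) U_* & I - k_n^2 U_* \end{pmatrix} \tag{4.32}$$

For n odd, n > 1

$$E^*_{t,n} = \Theta(k_n) \left[ \begin{pmatrix} E_{t,n-1} \\ E_{\delta^{(\frac{n-1}{2})}t,n-1} \end{pmatrix}^{*} - k_n U_* F^*_{\gamma^{-1}t,n-1} \right] \tag{4.33}$$

$$F^*_{t,n} = \Theta(k_n) \left[ F^*_{\gamma^{-1}t,n-1} - k_n U_* \begin{pmatrix} E_{t,n-1} \\ E_{\delta^{(\frac{n-1}{2})}t,n-1} \end{pmatrix}^{*} \right] \tag{4.34}$$

where

$$\Theta^{-T}(k_n)\, \Theta^{-1}(k_n) = I - k_n^2 U_* \tag{4.35}$$

For n = 1

$$E^*_{t,1} = \frac{1}{\sqrt{1 - k_1^2}} \left( E^*_{t,0} - k_1 F^*_{\gamma^{-1}t,0} \right) \tag{4.36}$$

$$F^*_{t,1} = \frac{1}{\sqrt{1 - k_1^2}} \left( F^*_{\gamma^{-1}t,0} - k_1 E^*_{t,0} \right) \tag{4.37}$$
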

Remark: Note that for n even we normalize $E_{t,n}$ and $E_{\delta^{(n/2)}t,n}$ together as one vector,
while for n odd, $E_{t,n}$ is normalized individually. This is consistent with the nature
of their statistics as described in (4.15)-(4.19) and with the fact that for n even
$\dim F_{t,n} = 2 \dim E_{t,n}$, while for n odd $\dim F_{t,n} = \dim E_{t,n}$.
Proof: This result is a relatively straightforward computation given (4.11)-(4.19).
For n even we begin with (4.7) and (4.21) and premultiply each by

$$\mathrm{diag}\left( \Sigma_{n-1}^{-1/2},\ \Sigma_{n-1}^{-1/2} \right)$$

Since $1_*$ is an eigenvector of $\Sigma_{n-1}$, $\Sigma_{n-1}$ and therefore $\Sigma_{n-1}^{-1/2}$ commute with $U_*$. This
immediately yields (4.30) and (4.31) where the matrix $\Theta(k_n)$ is simply the inverse of
the square root of the covariance of the term in brackets in (4.30) and in (4.31) (the
equality of these covariances follows from (4.15)). Equation (4.32) then follows from
(4.11) and (4.15). The case of n odd involves an analogous set of steps, and the n = 1
case is immediate.


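The preceding result provides us with a recursive procedure for calculating $\Sigma_n^{-1/2}$
(see Appendix D for an alternate efficient procedure). For n even we have

$$\Sigma_n^{-1/2} = \Theta(k_n)\, \mathrm{diag}\left( \Sigma_{n-1}^{-1/2},\ \Sigma_{n-1}^{-1/2} \right) \tag{4.38}$$

while for n odd, n > 1

$$\Sigma_n^{-1/2} = \Theta(k_n)\, \Sigma_{n-1}^{-1/2} \tag{4.39}$$

and for n = 1

$$\Sigma_1^{-1/2} = \left[ \left(1 - k_1^2\right) r_0 \right]^{-1/2} \tag{4.40}$$

The calculation of $\Theta(k_n)$ is also obtained in a straightforward way using the following
two formulae. For any $k > -1$

$$\left( I + k U_* \right)^{-1/2} = I + \left( \frac{1}{\sqrt{k+1}} - 1 \right) U_* \tag{4.41}$$

and for S and T symmetric

$$\begin{pmatrix} S & T \\ T & S \end{pmatrix}^{-1/2} = \frac{1}{2} \begin{pmatrix} X + Y & X - Y \\ X - Y & X + Y \end{pmatrix} \tag{4.42}$$

where

$$X = (S + T)^{-1/2}, \qquad Y = (S - T)^{-1/2} \tag{4.43}$$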
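
Both formulae can be verified numerically on small matrices. The following pure-Python sketch assumes that (4.41) reads $(I + kU_*)^{-1/2} = I + (1/\sqrt{k+1} - 1)U_*$ and that (4.42)-(4.43) have the block form just stated; it applies them to $S = I - k^2U_*$ and $T = (k - k^2)U_*$, the blocks appearing in (4.32). All function names are illustrative.

```python
import math

def eye(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def u_star(n):
    """U_* = 1_* 1_*^T, i.e. the n x n matrix with every entry equal to 1/n."""
    return [[1.0 / n] * n for _ in range(n)]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def scale(c, a):
    return [[c * x for x in row] for row in a]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def block2(a, b, c, d):
    """Assemble the block matrix [[a, b], [c, d]] from equally sized blocks."""
    return [ra + rb for ra, rb in zip(a, b)] + [rc + rd for rc, rd in zip(c, d)]

def inv_sqrt_rank_one(k, n):
    """(I + k U_*)^(-1/2) via (4.41); requires k > -1."""
    return add(eye(n), scale(1.0 / math.sqrt(1.0 + k) - 1.0, u_star(n)))

def inv_sqrt_block(k, n):
    """Inverse square root of [[S, T], [T, S]] via (4.42)-(4.43) for
    S = I - k^2 U_* and T = (k - k^2) U_*; requires -1/2 < k < 1."""
    x = inv_sqrt_rank_one(k - 2.0 * k * k, n)   # (S + T)^(-1/2)
    y = inv_sqrt_rank_one(-k, n)                # (S - T)^(-1/2)
    p = scale(0.5, add(x, y))
    q = scale(0.5, add(x, scale(-1.0, y)))
    return block2(p, q, q, p)

def max_abs_diff(a, b):
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))

if __name__ == "__main__":
    k, n = 0.3, 2
    m = inv_sqrt_rank_one(k, n)
    target = add(eye(n), scale(k, u_star(n)))
    print(max_abs_diff(matmul(matmul(m, m), target), eye(n)))  # close to zero
```

Formula (4.42) does not actually need $S$ and $T$ to commute: the orthogonal change of basis $\frac{1}{\sqrt{2}}\begin{pmatrix} I & I \\ I & -I \end{pmatrix}$ block-diagonalizes $\begin{pmatrix} S & T \\ T & S \end{pmatrix}$ into $\mathrm{diag}(S+T,\ S-T)$.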
Using (4.42), (4.43) we see from (4.32) that to calculate $\Theta(k_n)$ for n even we must
calculate

$$\left( I + \left(k_n - 2k_n^2\right) U_* \right)^{-1/2}$$

and

$$\left( I - k_n U_* \right)^{-1/2}$$

which can be done using (4.41) as long as $-1/2 < k_n < 1$. As mentioned previously
and as discussed in Section 5, $k_n = -1/2$ or 1 corresponds to the case of a singular
process and perfect prediction. For n odd, from (4.35) we see that we must calculate

$$\left( I - k_n^2 U_* \right)^{-1/2}$$

which exists and can be calculated using (4.41) as long as $k_n \ne \pm 1$, i.e. as long as we
are not dealing with a singular process for which perfect prediction is possible.

Now that we have a normalized form for the residual vectors, we can also describe
the normalized version of the modeling filters, which provide the basis for generating
isotropic $Y_t$'s specified by a finite number of reflection coefficients and driven by white
noise.


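Theorem 4.4  For n even we have

$$\begin{pmatrix} E^*_{t,n-1} \\ F^*_{t,n} \end{pmatrix} = \Sigma(k_n) \begin{pmatrix} \begin{pmatrix} E_{t,n} \\ E_{\delta^{(n/2)}t,n} \end{pmatrix}^{*} \\ F^*_{\gamma^{-1}t,n-1} \end{pmatrix} \tag{4.44}$$

where

$$\Sigma(k_n) = \begin{pmatrix} I + a(k_n) U_* & b(k_n) U_* & k_n U_* \\ -\frac{k_n}{2} U_* & I + c(k_n) U_* & b(k_n) U_* \\ d(k_n) U_* & -\frac{k_n}{2} U_* & I + a(k_n) U_* \end{pmatrix} \tag{4.45}$$

with

$$a(k) = \tfrac{1}{2} \sqrt{1-k}\, \left( \sqrt{1+2k} + 1 \right) - 1 \tag{4.46}$$

$$b(k) = \tfrac{1}{2} \sqrt{1-k}\, \left( \sqrt{1+2k} - 1 \right) \tag{4.47}$$

$$c(k) = \frac{\sqrt{1+2k} - (1+k)}{2} \tag{4.48}$$

$$d(k) = -c(k) - k \tag{4.49}$$

The matrix $\Sigma(k_n)$ is referred to as the scattering matrix, and it satisfies

$$\Sigma(k_n)\, \Sigma^T(k_n) = I \tag{4.50}$$

For n odd, n > 1

$$\begin{pmatrix} \begin{pmatrix} E_{t,n-1} \\ E_{\delta^{(\frac{n-1}{2})}t,n-1} \end{pmatrix}^{*} \\ F^*_{t,n} \end{pmatrix} = \Sigma(k_n) \begin{pmatrix} E^*_{t,n} \\ F^*_{\gamma^{-1}t,n-1} \end{pmatrix} \tag{4.51}$$

where the scattering matrix

$$\Sigma(k_n) = \begin{pmatrix} \left( I - k_n^2 U_* \right)^{1/2} & k_n U_* \\ -k_n U_* & \left( I - k_n^2 U_* \right)^{1/2} \end{pmatrix} \tag{4.52}$$

satisfies

$$\Sigma(k_n)\, \Sigma^T(k_n) = I \tag{4.53}$$

For n = 1:

$$\begin{pmatrix} E^*_{t,0} \\ F^*_{t,1} \end{pmatrix} = \Sigma(k_1) \begin{pmatrix} E^*_{t,1} \\ F^*_{\gamma^{-1}t,0} \end{pmatrix} \tag{4.54}$$

and

$$\Sigma(k_1) = \begin{pmatrix} \left(1 - k_1^2\right)^{1/2} & k_1 \\ -k_1 & \left(1 - k_1^2\right)^{1/2} \end{pmatrix} \tag{4.55}$$

also satisfies

$$\Sigma(k_1)\, \Sigma^T(k_1) = I \tag{4.56}$$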
Proof: We begin by solving (4.30) for $\left(E^*_{t,n-1},\ E^*_{\delta^{(n/2)}t,n-1}\right)$; then by substituting this
into (4.31) we obtain

$$\begin{pmatrix} E^*_{t,n-1} \\ E^*_{\delta^{(n/2)}t,n-1} \\ F^*_{t,n} \end{pmatrix} = \hat{\Sigma}(k_n) \begin{pmatrix} \begin{pmatrix} E_{t,n} \\ E_{\delta^{(n/2)}t,n} \end{pmatrix}^{*} \\ F^*_{\gamma^{-1}t,n-1} \end{pmatrix} \tag{4.57}$$

where

$$\hat{\Sigma}(k_n) = \begin{pmatrix} \Theta^{-1}(k_n) & k_n \begin{pmatrix} U_* \\ U_* \end{pmatrix} \\ \Theta(k_n) \begin{pmatrix} -k_n U_* & I \\ -k_n U_* & 0 \end{pmatrix} \Theta^{-1}(k_n) & \Theta(k_n) \begin{pmatrix} \left(k_n - k_n^2\right) U_* \\ I - k_n^2 U_* \end{pmatrix} \end{pmatrix} \tag{4.58}$$

To obtain the desired relation, we simply drop the calculation of $E^*_{\delta^{(n/2)}t,n-1}$ from (4.57).
To do this explicitly we consider $\hat{\Sigma}(k_n)$ as a matrix with three block-columns and four
block-rows (one each for $E^*_{t,n-1}$ and $E^*_{\delta^{(n/2)}t,n-1}$ and two for $F^*_{t,n}$). Thus what we wish to
do is to drop the second block-row. A careful calculation using the relations derived
previously yields (4.45)-(4.49). That $\Sigma(k_n)$ satisfies (4.50) follows immediately from
the fact that the vectors on both sides of (4.44) have identity covariances. The result
for n odd, n > 1 is obtained in a similar fashion, and the case of n = 1 is immediate.
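
The orthogonality of the scattering matrices can be spot-checked numerically. Every block in them has the form $I + cU_*$ or $cU_*$, so in the direction of $1_*$ each block acts as the scalar $1 + c$ or $c$, while on the orthogonal complement of $1_*$ the matrices reduce to the identity. It is therefore enough to test small scalar matrices. The sketch below does this under stated assumptions: $c(k)$ and $d(k)$ are taken as in (4.48)-(4.49), whereas the expressions used for $a(k)$, $b(k)$ and the placement of the $-k/2$ entries are our own derivation from the normalized recursions and should be treated as an assumption, not as a quotation of the theorem. All function names are hypothetical.

```python
import math

def coeffs(k):
    """Scalars a, b, c, d for -1/2 < k < 1 (a and b are assumed forms)."""
    s, p = math.sqrt(1.0 - k), math.sqrt(1.0 + 2.0 * k)
    a = 0.5 * s * (p + 1.0) - 1.0
    b = 0.5 * s * (p - 1.0)
    c = 0.5 * (p - (1.0 + k))
    d = -c - k
    return a, b, c, d

def scattering_even(k):
    """Even-order scattering matrix restricted to the 1_* direction."""
    a, b, c, d = coeffs(k)
    return [[1.0 + a, b, k],
            [-0.5 * k, 1.0 + c, b],
            [d, -0.5 * k, 1.0 + a]]

def scattering_odd(k):
    """Odd-order scattering matrix restricted to the 1_* direction, |k| < 1."""
    t = math.sqrt(1.0 - k * k)
    return [[t, k], [-k, t]]

def gram(m):
    """Compute m m^T."""
    return [[sum(x * y for x, y in zip(ri, rj)) for rj in m] for ri in m]

def is_orthogonal(m, tol=1e-12):
    g = gram(m)
    return all(abs(g[i][j] - (1.0 if i == j else 0.0)) < tol
               for i in range(len(m)) for j in range(len(m)))

if __name__ == "__main__":
    print(is_orthogonal(scattering_even(0.5)), is_orthogonal(scattering_odd(0.7)))
```

For $k = 0$ both matrices reduce to the identity, as expected for a stage that leaves the residuals unchanged.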
5  Characterization of Autoregressive and Regular Processes

The analysis in the preceding sections allows us to deduce a number of properties
of the class of autoregressive isotropic processes. The first result summarizes some
immediate consequences which we state without proof:

Proposition 5.1  If $Y_t$ is an AR(p) isotropic process, then

1. The reflection coefficients $k_n = 0$ for $n \ge p + 1$, and the forward and backward
normalized prediction error vectors $E^*_{t,p}$ and $F^*_{t,p}$ are standard white noise
processes (i.e. with unity covariance).

2. Let us write the formal power series $P_p$ defined in (3.70) as

$$P_p = \sum_{w \preceq 0} p_w \cdot w \tag{5.1}$$

If p = 0, $p_w = 0$ if $w \ne 0$. If p = 1, $p_w = 0$ unless $w = \gamma^{-k}$ for some $k \ge 0$. If
$p \ge 2$, then $p_w = 0$ for all words of the form $w = w_{\alpha,\beta}\gamma^{-k}$ with

$$w_{\alpha,\beta} \in \{\alpha, \beta\}^* \quad \text{and} \quad |w_{\alpha,\beta}| > \left[\tfrac{p}{2}\right] \tag{5.2}$$

In other words, $P_p$ has its support in a cylinder of radius $[\frac{p}{2}]$ around the path
$\{\gamma^{-k}\}$ toward $-\infty$. From this we also have that the modeling filter of an AR(p)
process has its support in the same cylinder of radius $[\frac{p}{2}]$ around $[t, -\infty)$. Conversely,
any process such that the modeling filter has its support contained in
the cylinder of radius $[\frac{p}{2}]$ is necessarily an AR(p) process.

Figure 5.1 illustrates the cylinder for an AR(2) process. Note that (1) is a generalization
of the result in Appendix A that stated that if an isotropic process has its
support concentrated on $[t, -\infty)$ it is necessarily AR(1).

Our analysis to this point has shown how to construct a sequence of reflection
coefficients $\{k_n\}$ from an isotropic covariance sequence $\{r_n\}$. Furthermore we have
seen that the $k_n$ satisfy particular bounds and that if $\{r_n\}$ comes from an AR(p)
process, only the first p of the reflection coefficients are nonzero. The following
result states that the converse holds, i.e. that any finite $k_n$ sequence satisfying the
required constraints corresponds to a unique AR covariance sequence. This result
substantiates our previous statement that the reflection coefficients provide a good
parameterization of AR processes.

Theorem 5.1  Given a finite sequence of reflection coefficients $k_n$, $1 \le n \le p$, such
that

$$\begin{cases} -\tfrac{1}{2} < k_n < 1 & \text{for } n \text{ even} \\ -1 < k_n < 1 & \text{for } n \text{ odd} \end{cases} \tag{5.3}$$

there exists a unique isotropic covariance sequence which has as its reflection coefficient
sequence the given $k_n$ followed by all zeroes.

The  proof  of  this  theorem  rests  on  the  following  which  is  obtained  immediately 
from  the  Schur  recursions: 


Lemma 5.1  Consider the transformation $\Psi$ which maps an isotropic covariance sequence
$\{r_n\}$ to the corresponding reflection coefficient sequence. The Jacobian of this
transformation satisfies the following:


dkn 
drm 
dk2n 
dr2n 
@k2n+i 
dr2n+ 1 


=  0  for  n  <  m 

2n-1_P2n-1(0)  ^  0 
1 

2n-l  (P2n(0)  -f  6(n)P2n(0))  ' 


(5.4) 

(5.5) 

(5.6) 


where  the  Pn  are  the  Schur  series  defined  in  (3.70) 


Proof of Theorem 5.1: Consider the modeling filter of order p specified by the
given set of reflection coefficients. What we must show is that the output of this
filter, $y_t$, is well defined (i.e. has finite covariance) and isotropic when the input is a
standard white noise process. That it is well defined follows from the stability result
in Theorem 4.2. Thus we need only show that $y_t$ is isotropic. More specifically, let
$(s,t)$ and $(s',t')$ be any two pairs of points such that $d(s,t) = d(s',t')$. The theorem
will be proved if we can show that the function

$$\Phi : K = (k_n)_{1 \le n \le p} \mapsto E(y_t y_s) - E(y_{t'} y_{s'}) \tag{5.7}$$

is identically zero.

The form of the modeling filter shows that $\Phi$ is a rational function of $K$. Thus it
is sufficient for us to show that $\Phi$ is zero on a nonempty open set in $\mathbb{R}^p$. Since we know
that $\Phi(K) = 0$ if $K$ is obtained via the transformation $\Psi$ of Lemma 5.1, it is sufficient for us to
show that the set of $K$ obtained via the transformation $\Psi$ has a nonempty interior.

Thanks to the form of the Schur recursions we know that $\Psi$ is also a rational
function and, thanks to Lemma 5.1, its Jacobian is triangular and always invertible.
Thus it is sufficient to show that the set of finite sequences $\{r_n \mid 0 \le n \le p\}$ that can
be extended to a covariance function of an isotropic process has a nonempty interior.
However, this property is characterized by a finite family of conditions of the form

$$\mathcal{R}(r_0, \ldots, r_p) \ge 0 \tag{5.8}$$

where $\mathcal{R}(r_0, \ldots, r_p)$ denotes a matrix whose elements are chosen from the $r_0, \ldots, r_p$.
The set of (p+1)-tuples satisfying these conditions with strict inequality is nonempty
(e.g. $r_n = \delta_{n0}$) and as a consequence the set of $r_0, \ldots, r_p$ satisfying (5.8) has a
nonempty interior.

Finally,  the  machinery  we  have  developed  allows  us  to  characterize  the  class  of 
regular  processes. 

Definition  5.1  An  isotropic  process  Yt  is  regular  or  purely  nondeterministic  if  no 
nonzero  linear  combination  of  the  values  of  yt  on  any  given  horocycle  can  be  predicted 
exactly  with  the  aid  of  knowledge  of  yt  in  the  strict  past. 

With the aid of a martingale argument, one can show that $Y_t$ is regular if and only if

$$\liminf_{n \to \infty} \lambda_{\inf}\left( \Sigma_{E,n} \right) > 0 \tag{5.9}$$

where $\lambda_{\inf}(A)$ denotes the minimum eigenvalue of $A$. Given the form of $\Sigma_n$ in (4.13),
(4.15), and (4.18), we can deduce that this is equivalent to

$$\liminf_{n \to \infty} \lambda_{\inf}\left( \Sigma_n \right) > 0 \tag{5.10}$$

Thanks to the structure of $\Sigma_n$ determined from (4.38)-(4.40) and the definition of
$\Theta^{-1}(k_n)$ in (4.32), (4.35), we can deduce that

$$\lambda_{\inf}\left( \Sigma_n \right) = r_0 \left( 1 - k_1^2 \right) \prod_{i=2}^{n} \lambda_{\inf}\left( \Theta^{-2}(k_i) \right) \tag{5.11}$$

and for i odd, i > 1

$$\lambda_{\inf}\left( \Theta^{-2}(k_i) \right) = 1 - k_i^2 \tag{5.12}$$

while for i even

$$\lambda_{\inf}\left( \Theta^{-2}(k_i) \right) = \min\left( 1 - k_i,\ 1 + k_i - 2k_i^2 \right) \approx 1 - |k_i| \quad \text{for } k_i \text{ small} \tag{5.13}$$

From this we can deduce the following:

Theorem 5.2  An isotropic process $Y_t$ is regular if and only if its reflection coefficient
sequence is such that

$$\sum_{n=1}^{\infty} \left( k_{2n-1}^2 + |k_{2n}| \right) < \infty \tag{5.14}$$
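
The link between the summability condition and the minimum eigenvalue can be explored numerically. The sketch below assumes that (5.11)-(5.13) read $\lambda_{\inf}(\Sigma_n) = r_0(1-k_1^2)\prod_{i=2}^{n}\lambda_{\inf}(\Theta^{-2}(k_i))$, with factor $1-k_i^2$ for odd $i$ and $\min(1-k_i,\ 1+k_i-2k_i^2)$ for even $i$; the function names and the two example sequences are hypothetical illustrations, not taken from the paper.

```python
def lambda_inf_theta_inv_sq(i, k):
    """Minimum eigenvalue of Theta^{-2}(k_i), following (5.12) and (5.13)."""
    if i % 2 == 1:
        return 1.0 - k * k
    return min(1.0 - k, 1.0 + k - 2.0 * k * k)

def lambda_inf_sigma(r0, ks):
    """Minimum eigenvalue of Sigma_n from (5.11); ks[0] is k_1, ks[1] is k_2, ..."""
    value = r0 * (1.0 - ks[0] ** 2)
    for i, k in enumerate(ks[1:], start=2):
        value *= lambda_inf_theta_inv_sq(i, k)
    return value

def regularity_partial_sum(ks):
    """Partial sum of the series in (5.14): k^2 for odd orders, |k| for even orders."""
    return sum(k * k if i % 2 == 1 else abs(k)
               for i, k in enumerate(ks, start=1))

if __name__ == "__main__":
    n_max = 2000
    fast = [0.5 / (n * n) for n in range(1, n_max + 1)]   # summable in both senses
    slow = [0.5 / n for n in range(1, n_max + 1)]          # even-order |k| not summable
    print(regularity_partial_sum(fast), lambda_inf_sigma(1.0, fast))
    print(regularity_partial_sum(slow), lambda_inf_sigma(1.0, slow))
```

The asymmetry between odd and even orders in (5.14) reflects the factors themselves: an odd-order stage shrinks the eigenvalue by a term of second order in $k_i$, whereas an even-order stage shrinks it by a term of first order.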
6  Conclusion


In  this  paper  we  have  described  a  new  framework  for  modeling  and  analyzing  signals  at 
multiple  scales.  Motivated  by  the  structure  of  the  computations  involved  in  the  theory 
of  multiscale  signal  representations  and  wavelet  transforms,  we  have  examined  the 
class of isotropic processes on a homogeneous tree of order 2. Thanks to the geometry
of  this  tree,  an  isotropic  process  possesses  many  symmetries  and  constraints.  These 
make  the  class  of  isotropic  autoregressive  processes  somewhat  difficult  to  describe  if  we 
look  only  at  the  usual  AR  coefficient  representation.  However,  as  we  have  developed, 
the  generalization  of  lattice  structures  provides  a  much  better  parametrization  of  AR 
processes  in  terms  of  a  sequence  of  reflection  coefficients. 

In  developing  this  theory  we  have  seen  that  it  is  necessary  to  consider  forward  and 
backward  prediction  errors  of  dimension  that  grows  geometrically  with  filter  order. 
Nevertheless,  thanks  to  isotropy,  only  one  reflection  coefficient  is  required  for  each 
stage  of  the  whitening  and  modeling  filters  for  an  isotropic  process.  Indeed  isotropy 
allowed  us  to  develop  a  generalization  of  the  Levinson  and  Schur  scalar  recursions  for 
the local averages or barycenters of the prediction errors, which also yield the reflection
coefficients. Finally we have justified our claim that the reflection coefficients are
a good parametrization for AR processes and isotropic processes in general by showing
that AR processes can be uniquely specified by these coefficients and that the regularity
of an isotropic process can be characterized in terms of its reflection coefficient
sequence.

It  is  our  belief  that  the  theory  developed  in  this  paper  provides  an  extremely  useful 
framework  for  the  development  of  multiscale  statistical  signal  processing  algorithms. 
In  particular  we  expect  this  framework  and  its  multidimensional  counterparts  to  be 
useful in analyzing signals displaying fractal-like or self-similar characteristics, i.e.
random  signals  whose  behavior  is  similar  at  multiple  scales.  Figure  6.1  illustrates  a 
sample  of  an  AR(1)  process  with  k1  =  0.99  which  displays  this  self-similar  behavior. 
Appendices


A  AR(1) and isotropic processes with strict past dependence

We wish to show that AR(1) processes are the only isotropic processes with strict
past dependence. To do this let us introduce the notation $]-\infty, t]$ to denote the path
from $t$ back towards $-\infty$, i.e. the set $\{\gamma^{-n}t \mid n \ge 0\}$, and consider a process of the
form

$$Y_t = \sum_{s \in ]-\infty, t]} a_{d(t,s)} W_s \tag{A.1}$$

where $W_t$ is unit variance white noise.

We now consider the conditions under which (A.1) is stationary. Let $t_1$ and $t_2$ be
any two nodes, let $t = t_1 \wedge t_2$, and define the distances $n_1 = d(t_1, t)$, $n_2 = d(t_2, t)$.
Note that $d(t_1, t_2) = n_1 + n_2$. Also let $r(t_1, t_2) = E(Y_{t_1} Y_{t_2})$. Then from (A.1), the fact
that $W_t$ is white, and the definition of $t$, $n_1$, and $n_2$, we have

$$\begin{aligned} r(t_1, t_2) &= \sum_{s_1 \in ]-\infty, t_1]}\ \sum_{s_2 \in ]-\infty, t_2]} a_{d(t_1,s_1)}\, a_{d(t_2,s_2)}\, E\left(W_{s_1} W_{s_2}\right) \\ &= \sum_{s \in ]-\infty, t]} a_{d(t_1,s)}\, a_{d(t_2,s)} \\ &= \sum_{m \ge 0} a_{n_1+m}\, a_{n_2+m} \end{aligned}$$

For $Y_t$ to be isotropic we must have that

$$r(t_1, t_2) = r\left(d(t_1, t_2)\right) = r(n_1 + n_2)$$

Therefore for $n_1 \ge 0$, $n_2 \ge 0$ we must have that

$$r(n_1 + n_2) = \sum_{m \ge 0} a_{n_1+m}\, a_{n_2+m} \tag{A.2}$$

In particular for $n \ge 2$ we can deduce from (A.2) that we have the following two
relationships

$$r(2n) = r(n + n) = \sum_{m \ge 0} a_{n+m}^2 = r(2n - 2) - a_{n-1}^2 \tag{A.3}$$

$$r(2n) = r\left((n+1) + (n-1)\right) = \sum_{m \ge 0} a_{n+1+m}\, a_{n-1+m} = r(2n - 2) - a_{n-2}\, a_n \tag{A.4}$$

from which we deduce that

$$a_{n-1}^2 = a_{n-2}\, a_n, \qquad n \ge 2$$

or equivalently

$$\frac{a_n}{a_{n-1}} = \text{constant}, \qquad n \ge 1$$

Thus $a_n = \sigma a^n$, so that

$$Y_t = \sum_{s \in ]-\infty, t]} \sigma\, a^{d(t,s)}\, W_s$$

from which we immediately see that $Y_t$ satisfies

$$Y_t = a\, Y_{\gamma^{-1}t} + \sigma W_t.$$
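
The conclusion of this appendix can be illustrated numerically: with geometric weights $a_n = \sigma a^n$ the sum in (A.2) depends on $n_1$ and $n_2$ only through $n_1 + n_2$, whereas a non-geometric weight sequence generally breaks this. The sketch below truncates the infinite sum and assumes (A.2) reads $r(n_1+n_2) = \sum_{m \ge 0} a_{n_1+m} a_{n_2+m}$; the function names and the numerical values are hypothetical.

```python
def weights(sigma, a, length):
    """Geometric weights a_n = sigma * a**n for n = 0, ..., length - 1."""
    return [sigma * a ** n for n in range(length)]

def covariance(n1, n2, w):
    """Truncated version of sum_{m >= 0} a_{n1+m} a_{n2+m} from (A.2)."""
    upper = len(w) - max(n1, n2)
    return sum(w[n1 + m] * w[n2 + m] for m in range(upper))

def closed_form(n, sigma, a):
    """Value of the geometric series: sigma^2 a^n / (1 - a^2), for |a| < 1."""
    return sigma ** 2 * a ** n / (1.0 - a ** 2)

if __name__ == "__main__":
    w = weights(1.5, 0.6, 400)
    # All pairs with n1 + n2 = 4 should give the same covariance.
    print([covariance(n1, 4 - n1, w) for n1 in range(5)])
    print(closed_form(4, 1.5, 0.6))
    # A non-geometric weight sequence: the two values below differ.
    bad = [1.0, 0.5, 0.1]
    print(covariance(0, 2, bad), covariance(1, 1, bad))
```

With 400 terms and $a = 0.6$ the truncation error is far below floating-point precision, so the truncated sums can be compared directly with the closed form.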
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B  The  Relation  Among  the  Parameters  of  AR(2) 


Consider  the  second-order  model  (2.45)  where  Wt  is  unit  variance  white  noise.  We 
would  like  to  show  that  the  coefficients  ax,  a2,  and  a3  are  related  by  a  fourth-order 
poylnomial  relation  that  must  be  satisfied  if  Yt  is  isotropic.  To  begin  note  that  from 
(2.45)  we  obtain  the  relation 

E(YtWt)  =  a3E(YStWt)  +  o  (B.l) 


while  from  (2.46)  we  find 


E(YStWt)  =  a3E(YtWt)  (B.2) 

from  which  we  deduce  that  |a3|  ^  1  and 

E(YtWt)  =  O 

al-al 

E(YstW,)  =  (B.3) 

a0  —  a3 

Next  consider  multiplying  (2.45)  by  each  of  the  following:  Yt,  Y$t,  Yy-it,  Yy-2t.  We 
take  expectations  using  (B.l),  (B.2)  and  the  fact  that  E(Yy-itWt )  =  E(Yy-2tWt )  =  0 
(since  we  are  solving  the  AR  equations  “casually” — see  (2.47),  (2.48)).  Assuming  that 
Y  is  isotropic,  we  obtain  the  following  four  linear  equations  in  the  three  unknowns 
ro,  rx,  r2: 

' 

r0  = 
r\  = 

i 

r2  = 
r2  = 

For  this  system  to  have  a  solution  the  coefficients  ai,  a2,  a3  must  satisfy 


+  a2r2  +  a3r2  +  o^r 
ai^o  +  «2  r\  +  a3r\ 
dir  i  +  a2r0  +  a3r2 
air  i  +  a2r2  +  a3r0  + 


(B.4) 


-1 

ax 

a2  +  a3 

-1 

ai 

a2  a 3  —  1 

0 

0 

a2 

ax 

03-1 

0 

«3 

ax 

a2  —  1 

~a3 

(B.5) 
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which  is  a  fourth-order  polynomial  relation.  It  is  straightforward  to  check  that  these 
are the only constraints on the $a_i$ in order for $Y$ to be isotropic (multiply (2.45) by
any $Y_{wt}$, $w \preceq 0$, $|w| > 2$ and take expectations; one obtains a unique expression for
each $r_n$, $n \geq 3$ in terms of the preceding values of $r$).
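
The determinant condition can be evaluated exactly for trial coefficients. The sketch below assumes that (B.4) is read as a homogeneous linear system in the vector $(r_0, r_1, r_2, -\sigma^2/(1-a_3^2))$, which gives the 4x4 coefficient matrix used here; the function names `b5_matrix`, `det` and `isotropy_residual` are illustrative and do not come from the paper.

```python
from fractions import Fraction


def b5_matrix(a1, a2, a3):
    """Coefficient matrix of (B.4), with the unknowns ordered as
    (r0, r1, r2, -sigma^2 / (1 - a3^2)).  This ordering is an assumption."""
    return [
        [-1, a1, a2 + a3, -1],
        [a1, a2 + a3 - 1, 0, 0],
        [a2, a1, a3 - 1, 0],
        [a3, a1, a2 - 1, -a3],
    ]


def det(m):
    """Determinant by Laplace expansion along the first row (fine for 4x4)."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, entry in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total


def isotropy_residual(a1, a2, a3):
    """Value of the determinant; zero means the constraint is met."""
    a1, a2, a3 = (Fraction(x) for x in (a1, a2, a3))
    return det(b5_matrix(a1, a2, a3))


# With a2 = a3 = 0 the last two rows coincide, so the determinant vanishes.
print(isotropy_residual(Fraction(1, 2), 0, 0))
# A generic triple is not expected to satisfy the constraint.
print(isotropy_residual(Fraction(1, 2), Fraction(1, 4), Fraction(1, 8)) == 0)
```

Exact rational arithmetic is used so that a zero determinant is not confused with rounding error. That the constraint holds for every $a_1$ when $a_2 = a_3 = 0$ is consistent with the model then reducing to the first-order form of Appendix A.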
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C  Properties  of  the  Statistics  of  the  Forward  and 
Backward  Residuals 

In this appendix we prove some of the results on the structure of the statistics of
the prediction errors $E_{t,n}$ and $F_{t,n}$ and their barycenters. The keys to the proofs of
all of these results (and to the others stated in Section 3 without proof) are the
constraints of isotropy and the construction of specific isometries.

C.1  Proof  of  Lemma  3.2 

Let 

$$G_{t,n}(w) = E\left(F_{\gamma^{-1}t,n-1}(w) \mid E_{t,n-1}\right) \qquad \text{(C.1)}$$

where $n$ is even and $|w| = n - 1$, $w \preceq 0$. We wish to show that $G_{t,n}(w)$ is identical
for all such $w$. By definition


Gt,nW  =  E  -  E  (Yvry-lt  )]  )  (C.2) 

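Define the set of nodes

$$\mathcal{T}_{t,n} = \{s = vt;\ |v| \le n,\ v \preceq 0\} \qquad \text{(C.3)}$$

The points $w\gamma^{-1}t$ in (C.2) correspond to the points $s = vt$ in $\mathcal{T}_{t,n}$ with $|v| = n$. Let
$w'$, $w''$ be any two words satisfying $|w| = n - 1$, $w \preceq 0$. Suppose that we can find a
local isometry $\phi: \mathcal{T}_{t,n} \to \mathcal{T}_{t,n}$ such that

$$\begin{aligned}
\phi(w'\gamma^{-1}t) &= w''\gamma^{-1}t \\
\phi(w''\gamma^{-1}t) &= w'\gamma^{-1}t \\
\phi(t) &= t \\
\phi(\mathcal{T}_{t,n-1}) &= \mathcal{T}_{t,n-1}
\end{aligned} \qquad \text{(C.4)}$$

By the isometry extension lemma $\phi$ can be extended to an isometry on $\mathcal{T}$.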

Consider $G_{t,n}(w')$ and $G_{\phi(t),n}(w'')$, which are linear projections onto, respectively,
$E_{t,n-1}$ and $E_{\phi(t),n-1}$. Since the processes $Y_t$ and $Y_{\phi(t)}$ have the same statistics, these
two projection operators are identical. Furthermore, from (C.4) we see that $\phi(t) = t$
and $\phi(\mathcal{T}_{t,n-1}) = \mathcal{T}_{t,n-1}$, so that we can conclude that $G_{t,n}(w') = G_{t,n}(w'')$.

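Thus it remains to show that we can construct such local isometries for any such
$w'$ and $w''$. To do this note that the words $w$ to be considered consist of

$$W = \bigcup_{p = n/2}^{n-1} W_p \qquad \text{(C.5)}$$

$$W_{n-1} = \{\gamma^{-n+1}\}, \qquad W_{n-2} = \{\delta\gamma^{-n+3}\}$$

$$W_p = \{\alpha,\beta\}^{n-p-2}\,\delta\,\gamma^{-p+1}, \qquad \frac{n}{2} \le p \le n - 3$$

where

$$\{\alpha,\beta\}^k = \{m \in \{\alpha,\beta\}^* : |m| = k\}$$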

We  now  describe  a  set  of  maps: 

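1. $\phi_{n-1}$ interchanges $W_{n-1}$ and $W_{n-2}$ and leaves the rest of $\mathcal{T}_{t,n}$ fixed. That is

$$\phi_{n-1}(\gamma^{-n}t) = \delta\gamma^{-n+2}t \qquad \text{(C.6)}$$

$$\phi_{n-1}(\delta\gamma^{-n+2}t) = \gamma^{-n}t \qquad \text{(C.7)}$$

$$\phi_{n-1}(s) = s \quad \text{for all other } s \in \mathcal{T}_{t,n} \qquad \text{(C.8)}$$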

2. For $\frac{n}{2} < p \le n - 2$, $\phi_p$ interchanges $W_{p-1}$ and $\bigcup_{q \ge p} W_q$. Specifically for any
such $p$, $\phi_p$ makes the following interchanges, leaving the remaining points in $\mathcal{T}_{t,n}$
fixed:


S7~p+1t  7  ~P~H 

akS7~p+1t  7 l<k<n— p  —  1 

™a/3poik7~p+1t  «-»  ma057~p~k+2t, 

1  <  k  <n  —  p  —  2,  0  <  | mQ0 1  <n  —  p  —  k 

3. For each $\frac{n}{2} \le p \le n - 3$ and any $1 \le k \le n - p - 2$, $\phi_{p,k}$ interchanges points
in $W_p$. Specifically, for any such $p$ and $k$, $\phi_{p,k}$ makes the following interchanges,
leaving the remaining points in $\mathcal{T}_{t,n}$ fixed:

mlpmli3Sl~Pt  *-*  mlpSmlpSi-n 

\mlp\  =  k  ,  0  <  \m2afj\  <n-k-p-2 

A straightforward, if somewhat tedious, computation verifies that (i) these are all
local isometries leaving $t$ fixed and $\mathcal{T}_{t,n-1}$ invariant, and (ii) for any $w'$, $w''$ in $W$
an isometry satisfying (C.4) can be constructed by composing one or more of the
isometries in (1)-(3). This completes the proof of Lemma 3.2.
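
The kind of check referred to above can be mechanized on a small example. The sketch below is not one of the maps (1)-(3): it is a generic illustration, assuming a finite rooted window of the dyadic tree in which each node is written as its path from the root over the letters `a` and `b` (the homogeneous tree itself has no root). It tests whether a map that exchanges the two subtrees below a chosen node preserves all pairwise distances. All function names are hypothetical.

```python
from itertools import product


def dist(u, v):
    """Tree distance between two nodes given as root-to-node paths."""
    k = 0
    while k < min(len(u), len(v)) and u[k] == v[k]:
        k += 1
    return len(u) + len(v) - 2 * k


def swap_children(prefix):
    """Map exchanging the two subtrees hanging below the node `prefix`."""
    flip = {"a": "b", "b": "a"}

    def phi(u):
        if len(u) > len(prefix) and u.startswith(prefix):
            i = len(prefix)
            return u[:i] + flip[u[i]] + u[i + 1:]
        return u  # nodes outside the two subtrees stay fixed

    return phi


def nodes(depth):
    """All nodes of the rooted binary tree down to the given depth."""
    return ["".join(w) for d in range(depth + 1) for w in product("ab", repeat=d)]


def is_isometry(phi, pts):
    """True if phi preserves the distance between every pair of points."""
    return all(dist(phi(u), phi(v)) == dist(u, v) for u in pts for v in pts)


pts = nodes(4)
phi = swap_children("a")
print(is_isometry(phi, pts))

# Swapping two single nodes without their neighbourhoods breaks distances.
bad = lambda u: {"aa": "ba", "ba": "aa"}.get(u, u)
print(is_isometry(bad, pts))
```

Compositions of such subtree swaps are again distance preserving, which is the mechanism used when the maps of the proof are composed.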

C.2  Proof  of  Lemma  3.3 

Let 

$$H_{t,n}(w,w') = E\left[F_{\gamma^{-1}t,n-1}(w)\,E_{t,n-1}(w')\right] \qquad \text{(C.9)}$$

where $n$ is even, $|w| = n - 1$, $w \preceq 0$ and $|w'| < n$, $w' \asymp 0$. We wish to show that
$H_{t,n}(w,w')$ is identical for all such $w$, $w'$ pairs. An argument analogous to that in the
preceding subsection shows that this will be true if we can construct two classes of
isometries:

1. For any $w_1$, $w_2$ satisfying $|w| = n - 1$, $w \preceq 0$, $\phi(w_1) = w_2$, $\phi(w_2) = w_1$, $\phi$
leaves $\mathcal{T}_{t,n-1}$ invariant and leaves fixed any point of the form $w't$, with $|w'| < n$,
$w' \asymp 0$.

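2. For any $w'_1$, $w'_2$ satisfying $|w'| < n$, $w' \asymp 0$, $\psi(w'_1) = w'_2$, $\psi(w'_2) = w'_1$, $\psi$ leaves
$\mathcal{T}_{\gamma^{-1}t,n-2}$ invariant and leaves fixed any point of the form $w\gamma^{-1}t$ with $|w| = n - 1$, $w \preceq 0$.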

It is straightforward to check that the isometries $\phi_{n-1}$, $\phi_p$, $\phi_{p,k}$ and their
compositions form a class satisfying (1). To construct the second class, let us recall the
representation and ordering of the words $w$ for which $|w| < n$, $w \asymp 0$ (see (3.8),
(3.9)), and let $w_m$ denote the $m$th of these with respect to this ordering, where
$0 \le m \le 2^{\frac{n}{2}-1} - 1$. We then define the following maps:

• For each $1 \le k \le \frac{n}{2} - 1$ and each $0 \le r \le 2^{\frac{n}{2}-k-1} - 1$, $\psi_{k,r}$ makes the following
interchanges, leaving the remaining points in $\mathcal{T}_{t,n}$ fixed:

$$\gamma^{-j} w_{r2^k} t \leftrightarrow \delta^{(k-j)}\gamma^{-j} w_{r2^k} t, \qquad 0 \le j \le k - 1$$

Again it is a straightforward computation to check that (i) each such $\psi_{k,r}$ is a local
isometry (so that it can be extended to a full isometry); (ii) $\psi_{k,r}$ leaves $\mathcal{T}_{\gamma^{-1}t,n-2}$
invariant and leaves fixed any point of the form $w\gamma^{-1}t$ with $|w| = n - 1$, $w \preceq 0$; and
(iii) for any $w'_1$, $w'_2$ satisfying $|w'| < n$, $w' \asymp 0$, we can construct $\psi$ as in (2) as a
composition of one or more of the $\psi_{k,r}$. This completes the proof of Lemma 3.3.

C.3  Proof  of  Lemma  3.4 

As  in  Section  C.2,  let  wm  denote  the  2^^^  words  such  that  |u>|  <  n,  w  x  0,  and  for 
any  two  such  words  let 


$$J_{t,n}(w_i,w_j) = E\left[E_{t,n}(w_i)\,E_{t,n}(w_j)\right] \qquad \text{(C.10)}$$

Let $n_1 = |w_i|$ and $n_2 = |w_j|$. Consider first the case when $n_1 \neq n_2$. What we
must show in this case is that $J_{t,n}(w_i,w_j)$ is the same for all pairs $w_i$, $w_j$ with these
respective lengths. By an argument analogous to the ones used previously, this will
be true if for any two pairs $(w_i,w_j)$, $(w'_i,w'_j)$ with $|w_i| = |w'_i| = n_1$, $|w_j| = |w'_j| = n_2$
we can find a local isometry $\phi$ of $\mathcal{T}_{t,n}$ so that $\phi$ leaves $\mathcal{T}_{\gamma^{-1}t,n-1}$ invariant and performs
the interchanges

$$w_i t \leftrightarrow w'_i t, \qquad w_j t \leftrightarrow w'_j t$$

Direct calculation shows that compositions of the $\psi_{k,r}$ defined in the previous
subsection yield such a local isometry.

Suppose now that $|w_i| = |w_j| = n_1$, and let $s = d(w_i t, w_j t)$. An analogous
argument shows that

$$J_{t,n}(w_i,w_j) = J_{t,n}(0,w_k), \quad \text{where } |w_k| = s \qquad \text{(C.11)}$$

Again an appropriate composition of the $\psi_{k,r}$ yields an isometry leaving $\mathcal{T}_{\gamma^{-1}t,n-1}$
invariant and performing the interchange

$$w_i t \leftrightarrow t, \qquad w_j t \leftrightarrow w_k t \qquad \text{(C.12)}$$

which finishes the proof of Lemma 3.4.


C.4  Proof  of  Lemma  3.6 

We wish to show that (3.53) holds for $n$ odd. Consider the first equality in (3.53).
As before, an argument using the isotropy of $Y_t$ shows that this equality will follow if
we can construct a local isometry, this time of $\mathcal{T}_{t,n+1}$, which leaves $\mathcal{T}_{\gamma^{-1}t,n-1}$ invariant
and which interchanges the sets

{wmt\0  <rn<  2~  1  —  1}  (C.13) 


and 

<m  <  2221_1  -  1}  (C.14) 

where  as  in  Section  C.2,  the  wm  are  the  ordered  words  such  that  |«;|  <  n  +  1,  w  x  0. 
The  isometry  fps=it0  (defined  as  in  Section  3.2  but  with  n  replaced  by  n  +  1)  has  the 
desired  properties. 

Consider now the second equality in (3.53). In this case we must construct an
isometry that again leaves $\mathcal{T}_{\gamma^{-1}t,n-1}$ invariant and which interchanges the set in (C.13)
and the set

$$\{w\gamma^{-1}t : |w| = n,\ w \preceq 0\}$$

The following local isometry $\phi$ has the desired property. Each element $s$ of $\mathcal{T}_{t,n+1}$ can
be written uniquely in the form


s 


-ail 

2 


~pt 


(C.15) 


where 


n  +  1 
2 


<p  < 


n  +  1 


(C.16) 
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(C.17) 


n  +  1 

\ma,0 1  +  — - - 1-  p  <  n  +  1 

The  desired  isometry  is  then,  in  essence,  a  reflection:  for  s  as  in  (C.15) 

<l>(s)  = 


(C.18) 


which completes the proof of Lemma 3.6.

D Calculation of $\Sigma^{-1/2}(\alpha_0, \ldots, \alpha_k)$


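From (3.35) and (4.42) we see that the computation of $\Sigma^{-1/2}(\alpha_0, \ldots, \alpha_k)$ can be
performed by a simple construction from the inverse square roots of

$$\Sigma_+ = \Sigma(\alpha_0, \ldots, \alpha_{k-1}) + \alpha_k U_{k-1} = \Sigma(\alpha_0 + \alpha_k, \ldots, \alpha_{k-1} + \alpha_k) \qquad \text{(D.1)}$$

$$\Sigma_- = \Sigma(\alpha_0, \ldots, \alpha_{k-1}) - \alpha_k U_{k-1} = \Sigma(\alpha_0 - \alpha_k, \ldots, \alpha_{k-1} - \alpha_k) \qquad \text{(D.2)}$$

If we introduce the following notation

$$\mathrm{Bloc}(X,Y) = \frac{1}{2}\begin{bmatrix} X + Y & X - Y \\ X - Y & X + Y \end{bmatrix} \qquad \text{(D.3)}$$

then $\Sigma^{-1/2}(\alpha_0, \ldots, \alpha_k)$ can be calculated via the following recursion:

$$\Sigma^{-1/2}(\alpha_0, \ldots, \alpha_k) = \begin{cases} \alpha_0^{-1/2} & \text{if } k = 0 \\ \mathrm{Bloc}\left(\Sigma_+^{-1/2}, \Sigma_-^{-1/2}\right) & \text{if } k \ge 1 \end{cases} \qquad \text{(D.4)}$$

which involves a sequence of scalar calculations.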
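
The recursion can be tried numerically. The sketch below assumes that $\Sigma(\alpha_0, \ldots, \alpha_k)$ has the nested block form $[[S, \alpha_k U], [\alpha_k U, S]]$ with $S = \Sigma(\alpha_0, \ldots, \alpha_{k-1})$, $U$ the all-ones matrix and $\Sigma(\alpha_0) = \alpha_0$; this structure is inferred from (D.1) and (D.2) rather than quoted from (3.35). The helper names `sigma`, `bloc`, `inv_sqrt` and `matmul` are illustrative, and every scalar reached at the bottom of the recursion is assumed to be positive.

```python
def sigma(alphas):
    """Build Sigma(alpha_0, ..., alpha_k) under the ASSUMED nested block form
    [[S, a_k U], [a_k U, S]], S = Sigma(alpha_0, ..., alpha_{k-1})."""
    if len(alphas) == 1:
        return [[alphas[0]]]
    s, ak = sigma(alphas[:-1]), alphas[-1]
    n = len(s)
    top = [s[i] + [ak] * n for i in range(n)]
    bottom = [[ak] * n + s[i] for i in range(n)]
    return top + bottom


def bloc(x, y):
    """Bloc(X, Y) = 1/2 [[X + Y, X - Y], [X - Y, X + Y]], as in (D.3)."""
    n = len(x)
    plus = [[(x[i][j] + y[i][j]) / 2 for j in range(n)] for i in range(n)]
    minus = [[(x[i][j] - y[i][j]) / 2 for j in range(n)] for i in range(n)]
    return ([plus[i] + minus[i] for i in range(n)]
            + [minus[i] + plus[i] for i in range(n)])


def inv_sqrt(alphas):
    """Recursion (D.4) for Sigma^{-1/2}(alpha_0, ..., alpha_k)."""
    if len(alphas) == 1:
        return [[alphas[0] ** -0.5]]  # scalar base case, needs alpha_0 > 0
    head, ak = alphas[:-1], alphas[-1]
    sigma_plus = inv_sqrt([a + ak for a in head])   # from (D.1)
    sigma_minus = inv_sqrt([a - ak for a in head])  # from (D.2)
    return bloc(sigma_plus, sigma_minus)


def matmul(a, b):
    """Plain-Python matrix product."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


alphas = [4.0, 1.0, 0.5]
m = inv_sqrt(alphas)
check = matmul(matmul(m, sigma(alphas)), m)  # should be close to the identity
for row in check:
    print([round(v, 9) for v in row])
```

Only scalar inverse square roots are ever computed, $2^k$ of them, which is the point of the construction. The recursion works because $\mathrm{Bloc}(X,Y)\,\mathrm{Bloc}(X',Y') = \mathrm{Bloc}(XX', YY')$ and, under the assumed block form, $\Sigma(\alpha_0, \ldots, \alpha_k) = \mathrm{Bloc}(\Sigma_+, \Sigma_-)$.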
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Translational  Shift 


Figure  2.2 


Figure 3.1: Illustrating $E_{t,n}$ (dots) and $F_{t,n}$ (squares) for n = 1, 2

Figure 3.2: Illustrating $E_{t,3}$ (dots) and $F_{t,3}$ (squares)

Figure 3.4: Illustrating $E_{t,5}$ and $F_{t,5}$

Figure 3.5: Illustrating $E_{t,3}$ (dots), $F_{\gamma^{-1}t,3}$ (squares), $E_{\delta^{(2)}t,3}$ (triangles), and
rr\2  (circles) 
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Figure  4.2:  Scattering  Blocks  of  Figure  4.1  for  (a)  n  odd;  and  (b)  n  even 
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